JavaScript for Data Science

by mrmagoo17 on 4/25/21, 8:50 AM with 75 comments
by beforeolives on 4/25/21, 11:13 AM

I know that data science is a broad and somewhat vague term but this -

   We will cover:

    Core features of modern JavaScript

    Programming with callbacks and promises

    Creating objects and classes

    Writing HTML and CSS

    Creating interactive pages with React

    Building data services

    Testing

    Data visualization

    Combining everything to create a three-tier web application
- this isn't data science.

by danpalmer on 4/25/21, 11:24 AM

I don’t want to repeat the old and tired JavaScript hate, but this just isn’t a great idea.

I’d suggest that there are 3 important primitives for data science: flexible numeric types, fast math/algorithm libraries, and data manipulation being easy.

JavaScript doesn’t really have any of these. Numbers are 64bit floats only - no integers, no big numbers. There aren’t equivalents to Numpy/Pandas/Scikit Learn, and the lack of standard library and expressiveness in data manipulation in the language makes basic tasks harder.
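A quick sketch of the numeric types plain JavaScript exposes may help ground this point. `Number` is indeed a 64-bit float, but `BigInt` (added in ES2020) and typed arrays also exist, though neither comes close to a NumPy-style numeric stack. The variable names below are illustrative only.

```javascript
// Number is an IEEE 754 double: integers are exact only up to 2^53 - 1.
const lossy = 2 ** 53 + 1; // rounds to 2 ** 53, the +1 is lost

// BigInt gives arbitrary-precision integers (note the `n` suffix).
const exact = 2n ** 53n + 1n; // keeps the +1

// Typed arrays give fixed-width storage, with C-like wraparound and rounding.
const wrapped = new Int32Array([2 ** 31])[0]; // overflows to the minimum int32
const single = new Float32Array([0.1])[0];    // 0.1 rounded to 32-bit precision

console.log(lossy === 2 ** 53);  // true
console.log(exact.toString());   // "9007199254740993"
console.log(wrapped);            // -2147483648
console.log(single === 0.1);     // false
```

BigInt and Number cannot be mixed in one arithmetic expression without an explicit conversion, which is part of why this still feels less flexible than the numeric towers in Python or Julia.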

JavaScript has its uses, but there’s really no reason to force data science to be one of them.

by czep on 4/25/21, 4:05 PM

To address some of the skepticism about when and where javascript would be appropriate in data science, would you want to fit a logistic regression model in javascript? Probably not, but to build a solver that takes model outputs and visualizes the changes in predicted probabilities based on different combinations of variables? This is definitely where javascript would make sense. Visualization, dashboards, reporting, and exploratory analysis are all ripe domains for developing rich responsive UIs. Basically, any layer where you have a data-to-human interface can be leveraged with javascript.
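As a rough sketch of that kind of solver: take coefficients from a logistic regression fitted elsewhere and compute predicted probabilities for each combination of inputs, ready to hand to a chart or table. The coefficient values and the variable names (`age`, `smoker`) are made up for illustration, not taken from any real model.

```javascript
// Hypothetical coefficients, assumed to come from a model fitted elsewhere
// (for example in R or Python) and shipped to the browser as JSON.
const model = { intercept: -1.5, coefficients: { age: 0.04, smoker: 0.9 } };

// Logistic (sigmoid) function: maps a linear predictor to a probability in (0, 1).
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Predicted probability for one combination of input variables.
function predict(model, inputs) {
  let z = model.intercept;
  for (const [name, beta] of Object.entries(model.coefficients)) {
    z += beta * inputs[name];
  }
  return sigmoid(z);
}

// Evaluate every combination of the supplied values (a small grid).
const ages = [30, 50, 70];
const smokerFlags = [0, 1];
const grid = ages.flatMap((age) =>
  smokerFlags.map((smoker) => ({
    age,
    smoker,
    probability: predict(model, { age, smoker }),
  }))
);

console.table(grid);
```

Each row of `grid` is a plain object, so it can be passed directly to most JavaScript charting libraries or rendered as a table that updates when a user changes the inputs.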

There is a lot of great work happening in this space already. In the R world for example, shiny makes heavy use of js to the point that you often can't tell where R code ends and javascript begins. Plotly's Dash provides bindings for R, Python, and Julia. Personally, as a data scientist, I have been excitedly learning React because it really rips the landscape wide open for all the use cases I mentioned above. It then makes sense to have libraries that give JS users a good data model and can do most of the same numerical computation that we'd be doing in other languages. Again, you probably don't want to do serious numerical work in js, but remember people said that about Python ten years ago too.

I love the framing of this book, because I want more data scientists to start thinking about the presentation of data and spark some bits of ingenuity to make datasets and model outputs accessible to non-data scientists. Data scientists should be the ones writing the tools that interface data with humans because of their domain knowledge. But this is a different skillset and usually the work of SW engineers. Of course engineers can have great data intuition too, but I really do encourage data scientists to develop their front-end skills; it's well worth it.

by tharne on 4/25/21, 4:52 PM

I don't see the point of this. You already have a ubiquitous, easy-to-learn, high-level language that's great for data science, it's called python. If you're a JavaScript developer who wants to get into data science but are too lazy to learn python, you probably weren't that interested in data science in the first place.

Python definitely has some problems, but if you were going to have a new lingua franca for data science, it would probably be something like Julia, certainly not JavaScript.

by la_fayette on 4/25/21, 11:45 AM

Data science is not a standardized term, however I don't get what specifically makes this text relevant for the domain of data science... For some data science projects one could surely use javascript, however in many cases one misses important libraries, for purposes such as statistical analysis, data manipulation, machine learning, ...

by genrez on 4/25/21, 2:58 PM

I am a noob to Javascript, so if someone knows better, then please correct me about this, but arrow functions aren't meant to replace normal function syntax, right? From [1], it seems like the main point of arrow syntax is to allow you to inherit the "this" parameter if you are inside a method. Meanwhile, you need normal function syntax if you are creating a constructor, making a method function for a prototype, or making generator functions. (I didn't even know javascript had generator functions until just now :))

So it seems a bit weird to me that they advocate using arrow function syntax instead of the regular syntax. They seem to be advocating using the new class syntax instead, so I guess they don't need the constructor or method creation features of the normal syntax, but I still don't see why they would specifically advocate for arrow function syntax. Is it faster? They say it interferes with other features, but which features?

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
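A minimal sketch of the distinction being asked about, using standard language behavior rather than anything from the book. The names (`Counter`, `ArrowFn`, `RegularFn`, `countTo`) are invented for illustration.

```javascript
class Counter {
  constructor() {
    this.count = 0;
  }

  // Arrow callback: `this` is inherited from the enclosing method,
  // so it refers to the Counter instance.
  incrementAllArrow(values) {
    values.forEach(() => {
      this.count += 1;
    });
    return this.count;
  }

  // Regular function callback: `this` is set by the caller. forEach passes
  // undefined here (class bodies are strict mode), so this throws a TypeError.
  incrementAllRegular(values) {
    values.forEach(function () {
      this.count += 1;
    });
    return this.count;
  }
}

// Arrow functions cannot be used as constructors; regular functions can.
const ArrowFn = () => {};
function RegularFn() {
  this.made = true;
}

// Generators require the function* form; there is no arrow equivalent.
function* countTo(n) {
  for (let i = 1; i <= n; i++) yield i;
}

console.log(new Counter().incrementAllArrow([10, 20, 30])); // 3
console.log(new RegularFn().made);                          // true
console.log([...countTo(3)]);                               // [ 1, 2, 3 ]
```

So the arrow form is mainly a convenience for callbacks that need the surrounding `this`; constructors and generators still need the `function` keyword (or `class` syntax for constructors).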

by talolard on 4/25/21, 8:14 PM

As a data scientist who does more frontend, I think this is a really valuable concept. Having users/stakeholders engage with our work is the way to push it forward in the org, and a dash of frontend can do wonders for getting that message across. It’s wonderful that people are making resources about the frontend for data scientists.

by brianzelip on 4/25/21, 5:50 PM

Just putting this out there: stdlib - a standard library for js, https://stdlib.io/.

by mark_l_watson on 4/25/21, 3:41 PM

I thought of writing a Javascript + tensorflow.js + NLP + web scraping + linked data + etc. book about a year ago. tensorflow.js is especially cool: well documented with great examples. In fact, it was the great tensorflow.js examples and demos that convinced me not to write the book because I didn't feel like I could add much value on that subject.

by splithalf on 4/25/21, 1:43 PM

Data scientists are the new webmasters.

by slt2021 on 4/25/21, 2:24 PM

hard pass.

even python is not used for data science, all heavy lifting is done in C/fortran, and python is just a glue

by Rainymood on 4/25/21, 1:37 PM

Really cool but no one needs this... as a data scientist learning javascript, teach me how to run data science models using javascript! That's where the real gold is... I'm even thinking of writing articles about this myself... JS is great for making things more tangible and interactive

by m00dy on 4/25/21, 12:39 PM

well, I was expecting training a neural network with web-assembly through gpu support in its last chapter :)

by temp8964 on 4/25/21, 11:32 AM

They use data-forge.js, which has fewer stars than danfo.js.

I can't find any benchmark of how they compare to data.table or pandas.

Without a dominant and high performance data frame library as a foundation, I wouldn't even try.

by jason0597 on 4/25/21, 1:05 PM

Why on earth would you want to use JavaScript for Data Science?