Ask HN: JavaScript machine learning library?

mayank · on Sept 18, 2010

A point and an idea:

(1) Modern classification algorithms like SVMs need some pretty hardcore math routines (SVMs require Quadratic Programming, which isn't trivial to implement correctly). Do you intend to implement these yourself? If so, that alone might be useful as a separate library, with the ML library built on top of it.

(2) I've been thinking about a JS distributed computing library for a long time -- sort of like Folding@home, but instead of having to download a program, you just visit a website and let JS do the crunching (Ajax will pull and push data chunks). With modern JS engines, this has become more of a reality. So to bring it back to your question -- why not try to abstract as much of the math+algorithms from as many distributed computing projects as possible, and then build a generic JS library for doing distributed computation? You could have each distributed computing project as a benchmark -- i.e., start by implementing SETI@home using your library, then move on to Folding@home. I guarantee you won't be bored... ;)
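A rough sketch of the pull, crunch, push loop from (2), to make the idea concrete. Everything here is hypothetical: `runVolunteer`, `pullChunk`, `crunch`, and `pushResult` are made-up names, not part of any existing library, and the "server" is just an in-memory queue. In a real page the two transport functions would wrap asynchronous Ajax requests rather than return synchronously.

```javascript
// Hypothetical sketch: none of these names come from a real library.
// The transport is injected so the loop can run without a server;
// in a browser, pullChunk and pushResult would wrap Ajax requests.
function runVolunteer(pullChunk, crunch, pushResult) {
  let processed = 0;
  // Assumption: pullChunk returns null once there is no more work.
  for (let chunk = pullChunk(); chunk !== null; chunk = pullChunk()) {
    pushResult({ id: chunk.id, value: crunch(chunk.data) });
    processed += 1;
  }
  return processed;
}

// Stand-in "server": an in-memory queue of work units.
const workQueue = [
  { id: 1, data: [1, 2, 3] },
  { id: 2, data: [4, 5, 6] },
];
const collected = [];

const processedCount = runVolunteer(
  () => (workQueue.length > 0 ? workQueue.shift() : null),
  (data) => data.reduce((sum, x) => sum + x * x, 0), // toy workload: sum of squares
  (result) => collected.push(result)
);
```

Keeping `crunch` a pure function of the chunk is the design choice that would let the same loop be reused across different projects: only the workload and the transport change.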

Good luck, and I'm really glad people are pushing the capabilities of JS these days.

On a lighter note, can you imagine the derisive laughter if you had suggested this 10 years ago? :)

catshirt · on Sept 19, 2010

This might be relevant to your (or our collective) interests: http://news.ycombinator.com/item?id=1645520 (MapRejuice - Distributed Client Side Computing (Node Knockout Entry))

thinkalone · on Sept 18, 2010

I don't know enough about the field, but I'm throwing out this link, since I saw it earlier this week - Brain JS http://harthur.github.com/brain/

tectonic · on Sept 18, 2010

Thanks for sharing, I had not seen that!

thinkalone · on Sept 18, 2010

No problem, I hope it helps you out or gives you some inspiration!

Since I don't know much about machine learning: what do you plan to build your JS library to do or support? Is it just for research, or are there ways to use it in a more everyday webapp?

Detrus · on Sept 18, 2010

Yes, these would be useful to me. What would the performance be like compared to other options like Java? You'd have to use the new typed arrays to have any hope of comparable performance, right?
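For anyone who hasn't used the typed arrays mentioned above, here is a minimal sketch of the kind of inner loop an ML library would run over them: a dot product on `Float64Array` buffers. The `dot` helper and the sample values are illustrative assumptions, and whether this actually closes the gap with Java depends on the engine, so treat it as a shape of the code rather than a benchmark.

```javascript
// Illustrative only: a dot product over typed arrays.
// Float64Array stores raw 64-bit floats in one contiguous buffer,
// unlike a plain Array, which can hold values of any type.
function dot(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Made-up sample data: a weight vector and a feature vector.
const weights = new Float64Array([0.5, -1.0, 2.0]);
const features = new Float64Array([2.0, 3.0, 4.0]);
const score = dot(weights, features); // 0.5*2 - 1*3 + 2*4
```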

This http://harthur.github.com/brain/ seems very slow to train.

sesqu · on Sept 18, 2010

It would not be immediately useful to me personally, but it is something I've been intending to build over the coming winter (for educational purposes and for interface experimentation).

tectonic · on Sept 18, 2010

Maybe we can collaborate.

sesqu · on Sept 18, 2010

Sure. Drop me an email if/when you get started.

waterside81 · on Sept 18, 2010

We're building out our REST API to allow you to create, train, (and re-train) your own SVM. In theory, you could use our API entirely on the client side, using JavaScript.

From my experience in seeing performance and the kind of tweaking we've had to do to be able to score 10K documents/s, you need some nitty-gritty C code that I don't think can run in a browser.

http://www.repustate.com

nl · on Sept 20, 2010

Assuming you mean JavaScript-in-the-browser, then meh... (I'm sure that's not the answer you were looking for, but hear me out):

Why would this be useful? Machine learning generally needs two things that browsers aren't very good at dealing with: (1) large amounts of data and (2) fast I/O to process that data.

Why would someone prefer to use a client library rather than a remote call to a high-performance server-side library, which will give better results?

Having said that, there are a few very specialized areas where this might make sense. For example, a JavaScript Haar classifier would be useful for machine vision in a browser.
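To give a sense of what the core of a Haar classifier looks like, here is a sketch of one building block: an integral image plus a single two-rectangle Haar-like feature (left half minus right half). The function names and the tiny grayscale "image" are assumptions for illustration only. A real classifier would also need a trained cascade of many such features with thresholds, which is not shown.

```javascript
// Build an integral image with one extra row and column of zeros, so that
// ii[y * stride + x] holds the sum of all pixels above and to the left of (x, y).
function integralImage(pixels, width, height) {
  const stride = width + 1;
  const ii = new Float64Array(stride * (height + 1));
  for (let y = 0; y < height; y++) {
    let rowSum = 0;
    for (let x = 0; x < width; x++) {
      rowSum += pixels[y * width + x];
      ii[(y + 1) * stride + (x + 1)] = ii[y * stride + (x + 1)] + rowSum;
    }
  }
  return { ii, stride };
}

// Sum of the w-by-h rectangle whose top-left corner is (x, y): four lookups.
function rectSum(table, x, y, w, h) {
  const { ii, stride } = table;
  return (
    ii[(y + h) * stride + (x + w)] -
    ii[y * stride + (x + w)] -
    ii[(y + h) * stride + x] +
    ii[y * stride + x]
  );
}

// Two-rectangle feature: left half minus right half (assumes w is even).
function twoRectFeature(table, x, y, w, h) {
  const half = w / 2;
  return rectSum(table, x, y, half, h) - rectSum(table, x + half, y, half, h);
}

// Made-up 4x2 grayscale patch: dark on the left, bright on the right.
const patch = new Uint8Array([
  1, 1, 5, 5,
  1, 1, 5, 5,
]);
const table = integralImage(patch, 4, 2);
const edgeResponse = twoRectFeature(table, 0, 0, 4, 2);
```

The point of the integral image is that each rectangle sum costs four array reads regardless of rectangle size, which is what makes evaluating many features per window plausible at all.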

tectonic · on Sept 18, 2010

Follow up: what would you do with clustering and classification on the client side? What about optimization?

Andi · on Sept 18, 2010

It's not only about the client side. With the rise of node.js, it will be more and more relevant for server-side, data-heavy processing.

bustamove · on Sept 18, 2010

You are right, there are many things that can be done using machine learning, like music classification by genre or artist, or a predictions API (mentioned in a general fashion on purpose, because you are limited to the datasets and problems you choose to work on). If a JavaScript library that works smoothly and makes these things work better can be built, I think it is definitely worth it. And you are right about the node.js thing.

bustamove · on Sept 18, 2010

I'd also point you to some interesting resources about concepts that you could easily implement in your library, like NGD (Normalized Google Distance). Just an idea: use it to make smarter tag clouds? http://www.complearn.org/
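For reference, NGD is computed from hit counts: with f(x) and f(y) the number of pages containing each term, f(x, y) the number containing both, and N the total number of indexed pages, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). A minimal sketch follows. The function name is made up, the counts in the example are invented, and returning `Infinity` for zero counts is a convention chosen for this sketch.

```javascript
// Normalized Google Distance from raw hit counts.
// fx, fy: pages containing each term; fxy: pages containing both; n: index size.
function normalizedGoogleDistance(fx, fy, fxy, n) {
  // Convention for this sketch: terms that never occur (or never co-occur)
  // are treated as infinitely far apart.
  if (fx <= 0 || fy <= 0 || fxy <= 0) return Infinity;
  const lx = Math.log(fx);
  const ly = Math.log(fy);
  const lxy = Math.log(fxy);
  return (Math.max(lx, ly) - lxy) / (Math.log(n) - Math.min(lx, ly));
}

// Invented counts, purely to show the call.
const sameTerm = normalizedGoogleDistance(1000, 1000, 1000, 1e6); // terms always co-occur
const looselyRelated = normalizedGoogleDistance(100, 100, 10, 10000);
```

For a tag cloud, tags with a small pairwise distance could be grouped or placed near each other; where the counts come from (a search API or your own corpus) is left open here.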

natahmed · on Oct 29, 2010

I'm looking for something like this to build on for image driven searching. If anyone knows of anything, please email me at ahmed.nat ATSIGN gmail.com

bustamove · on Sept 18, 2010

If you are planning to release this as a product, I doubt that it could gain traction, although the whole node.js thing makes me wonder whether everything is moving to the client, even heavy computational tasks such as ML or AI problems. If it is a project just for the sake of it or for fun, then it would be cool to see your implementation.