Example case study: The user uploads a CSV file containing a year's worth of their bank statement entries. We want to parse the file, categorise each entry and calculate cumulative values for each category, so that we can store the newly categorised statement in a database and display a spending analysis to the user.
The entries are categorised by matching strings in the descriptions. There are many categories and many entries and it takes a fair amount of time to process.
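To make the workload concrete, here is a minimal sketch of that kind of categorisation. The rule table, the entry shape (`description`, `amount`) and the function names are all assumptions for illustration, not taken from a real app:

```javascript
// Hypothetical rule table: each category lists substrings to look for
// in an entry's description.
const RULES = {
  groceries: ['tesco', 'sainsbury'],
  transport: ['uber', 'rail'],
};

// Return the first category whose substrings match the description.
function categorise(description, rules = RULES) {
  const text = description.toLowerCase();
  for (const [category, needles] of Object.entries(rules)) {
    if (needles.some((needle) => text.includes(needle))) {
      return category;
    }
  }
  return 'uncategorised';
}

// Tag every entry with a category and accumulate a total per category.
// Entries are assumed to look like { description: string, amount: number }.
function summarise(entries, rules = RULES) {
  const totals = {};
  const categorised = entries.map((entry) => {
    const category = categorise(entry.description, rules);
    totals[category] = (totals[category] || 0) + entry.amount;
    return { ...entry, category };
  });
  return { categorised, totals };
}
```

Every entry is checked against every substring, so the cost grows with entries × rules, and the whole thing runs synchronously on whichever thread calls `summarise`.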
In our node.js server, we can happily free up the event loop whilst waiting for network responses and so on, but if there is any data crunching or similar processing, the server will be blocked from responding to requests, and this seems unavoidable.
Traditionally, the CSV file would be passed to the server, the server would process, save in db, and send back the output of the processing.
It seems to make sense in our single-threaded node.js server that this processing is handled by the browser, and the output displayed and sent to the server to be stored. Of course the client will have to wait while this is done, but their processing will not be preventing the server from responding to requests from other clients.
I'm interested to see if anyone has had experience building apps using this model.
So, the question is: are there any issues in getting browsers rather than the server to handle, wherever possible, any processing that will block the event loop? Is this a good/sensible/viable approach to node.js application development?
Although perfectly possible, simply shifting the processing to the client machine does not solve the basic problem.
Now the client's event loop is blocked, preventing the user from interacting with the browser. Browsers tend to detect a long-running script and warn the user or stop execution of the page's script altogether, which your users will certainly hate.
There is no way around either delegating or splitting up the workload. Using a second process (for example a second node instance) for the number crunching server-side has the added benefit of allowing the operating system to use a second CPU core. Ideally you run as many Node instances as you have CPU cores in the server and balance your workload between them. Have a look at the diode module for some inspiration on how to implement multi-process communication in node.
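As a rough sketch of the "splitting up" option, the work can be cut into slices with the event loop given a turn between them. `processInChunks` and `chunkSize` are hypothetical names, and the right slice size would need tuning for your data:

```javascript
// Process `items` in slices, yielding to the event loop between slices so
// that pending requests can be served. Resolves with the collected results.
function processInChunks(items, handleItem, chunkSize = 500) {
  return new Promise((resolve, reject) => {
    const results = [];
    let index = 0;

    function runChunk() {
      try {
        const end = Math.min(index + chunkSize, items.length);
        for (; index < end; index++) {
          results.push(handleItem(items[index]));
        }
      } catch (err) {
        reject(err);
        return;
      }
      if (index < items.length) {
        // Queue the next slice; other queued callbacks get to run first.
        setImmediate(runChunk);
      } else {
        resolve(results);
      }
    }

    runChunk();
  });
}
```

In a browser the same idea works with `setTimeout(runChunk, 0)` in place of `setImmediate`. Note that this only keeps the process responsive; the total CPU time is unchanged, so a second process is still what lets you use another core.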