The road so far….

August 29, 2014

Horizontal Scaling using Parallel file systems

Filed under: Linux — Rahul Sharma @ 8:45 am

If business is good, you need to be prepared to scale your services. Stateless services are easy to scale, but stateful ones need some thought before you start. We had one service that built state on its local disk: think of an invoice generation service where your customers generate invoices for their own customers in all kinds of formats, e.g. PDF and Excel. Scaling the service vertically was easy (shut down the EC2 instance, change the instance type and start it up again), but there is a ceiling to that. How can we scale such a service horizontally without rewriting the component?

When we talk about scaling a file-system-based service, we are really talking about scaling the file system. There are different ways to do that; in the simplest case we can share a drive using NFS. Sharing is easy and there are loads of tutorials on it, but it has its problems. NFS locks the file system while someone is writing to the disk. It queues all the writes, and each request waits until the write operation is complete. The result is a severe degradation in performance.

pNFS was developed to allow NFS to service multiple write requests in parallel. Its support varies across Linux distributions. I could not get it working on our Ubuntu platform (I believe it is not yet supported there), so I cannot comment on it. The next option is a distributed file system like HDFS, but that would mean rewriting all our file commands, which we wanted to avoid.

While we were running out of ideas, we came to know about clustered file systems. We tried two of them, Gluster and Fraunhofer. We tried Gluster first: it is completely open source, does not require any special hardware and is easy to set up. We were off to a flying start, and the service could be scaled horizontally across many more machines. But the service soon stopped responding. We realized that our service creates lots of small files, and Gluster has issues with small writes.
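A quick way to catch the small-write problem before production is to time many small synchronous writes on the candidate mount and compare the result with local disk. The sketch below is only an illustration: the function name `write_small_files`, the file naming and the default sizes are assumptions, not what we actually ran.

```python
import os
import tempfile
import time


def write_small_files(directory, count=1000, size=4096):
    """Write `count` files of `size` bytes each; return files written per second."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        # Hypothetical file names, standing in for generated invoices.
        path = os.path.join(directory, f"invoice_{i:06d}.bin")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the file system
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Baseline on local temporary storage; run it again with `directory`
    # pointing at the clustered mount to compare the two rates.
    with tempfile.TemporaryDirectory() as local_dir:
        print(f"local: {write_small_files(local_dir):.0f} files/s")
```

If the rate on the shared mount is far below the local baseline, the file system is likely to struggle with a workload made of many small files.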

Next we tried Fraunhofer. It was also easy to set up and did not require any special hardware. The service was back on track; we ran it for a couple of days on our test bed and it worked like a charm. But one day we found that it was no longer responding. After trying out a couple of things, we realized that we had run out of inodes on the drive. We deleted a few files and were back on track. We also realized that inode usage is not reported by the “df -ih” command: the values are always zero. Now that’s a new problem, and we still haven’t found a solution for it. For now we have formatted a larger disk with a smaller block size, which gives us a larger number of inodes to work with.
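Since running out of inodes took the service down without warning, it may be worth checking inode usage from a script rather than by hand. Below is a minimal sketch using Python's standard `os.statvfs`; the helper names `inode_usage` and `inode_alert` and the 90% threshold are assumptions for illustration. A file system that reports zero total inodes gets an "unknown" status, since on such a mount this check cannot tell you anything.

```python
import os


def inode_usage(path):
    """Return (total, free, used_fraction) for the file system holding `path`."""
    st = os.statvfs(path)
    total = st.f_files  # total inodes reported by the file system
    free = st.f_ffree   # free inodes reported by the file system
    if total == 0:
        # No inode totals reported, so a usage fraction cannot be computed.
        return total, free, None
    return total, free, (total - free) / total


def inode_alert(path, threshold=0.9):
    """Return 'ok', 'low' or 'unknown' for the inode headroom at `path`."""
    _, _, used = inode_usage(path)
    if used is None:
        return "unknown"
    return "low" if used >= threshold else "ok"


if __name__ == "__main__":
    # "." is a placeholder; pass the mount point of the shared file system.
    print(inode_usage("."), inode_alert("."))
```

Run from cron or a monitoring agent, a check like this can raise an alarm before the drive actually runs out of inodes, provided the file system reports its counts.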

Ref: Fraunhofer vs Gluster by Harry Mangalam
