HDFS support #7

Open
everpeace opened this issue Feb 7, 2014 · 6 comments

Comments

@everpeace
Owner

  • Where should the NameNode run? On master1?
  • DataNodes should probably run on all slaves.
@24601

24601 commented Apr 25, 2014

Here is how I added HDFS support:

  1. Used my fork of hadoop_cookbook (https://github.com/24601/hadoop_cookbook). I have a pull request open against the original to merge in the very small change I made so the cookbook supports Ubuntu 13.04.

  2. Configured the cluster for 2 masters + 3 slaves as follows:

     - Masters are NameNodes in an HA config with auto-failover (might as well while we're at it, I figure...)
     - JournalNodes & DataNodes run on masters + slaves (maybe there's no need for DataNodes on the masters, and probably no need for JournalNodes on the masters either? I figured it couldn't hurt for now and it's easy to drop later on)
     - HDFS uses the ZK quorum established by vagrant-mesos

  3. HA uses sshfence, with the existing key management provided by vagrant-mesos

Once I clean things up (like removing my AWS creds from cluster.yaml), I can fork and create a PR if you want...

This required some appreciable changes to the multinodes Vagrantfile to ensure the HDFS configuration was inserted into the chef.json object (I am NOT a Ruby programmer, so there is probably a better/more robust way to do it than mine, but my way works...), plus adding the hadoop cookbook to the Berksfile.
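
For readers following along, here is a rough sketch of what an HDFS HA fragment of chef.json could look like when built in Ruby. It is not the code from the fork. The helper name `hdfs_ha_json`, the `hadoop` / `core_site` / `hdfs_site` attribute layout, the nameservice name, the hostnames, the key path, and the ports (the usual Hadoop and ZooKeeper defaults) are all assumptions for illustration; only the property names are standard Hadoop 2 HA settings.

```ruby
# Hypothetical helper: builds the HDFS HA part of a chef.json hash.
# ASSUMPTION: the hadoop cookbook reads node['hadoop']['core_site'] and
# node['hadoop']['hdfs_site']; check the cookbook's README before relying on it.
def hdfs_ha_json(nameservice:, namenodes:, journalnodes:, zookeepers:, fence_key:)
  hdfs_site = {
    'dfs.nameservices' => nameservice,
    "dfs.ha.namenodes.#{nameservice}" => namenodes.keys.join(','),
    # JournalNode quorum that holds the shared edit log (8485 is the usual default port)
    'dfs.namenode.shared.edits.dir' =>
      "qjournal://#{journalnodes.map { |h| "#{h}:8485" }.join(';')}/#{nameservice}",
    "dfs.client.failover.proxy.provider.#{nameservice}" =>
      'org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
    'dfs.ha.automatic-failover.enabled' => 'true',
    # sshfence needs a private key that can log in to the other NameNode
    'dfs.ha.fencing.methods' => 'sshfence',
    'dfs.ha.fencing.ssh.private-key-files' => fence_key
  }
  # One RPC address per NameNode id
  namenodes.each do |id, host|
    hdfs_site["dfs.namenode.rpc-address.#{nameservice}.#{id}"] = "#{host}:8020"
  end
  core_site = {
    'fs.defaultFS' => "hdfs://#{nameservice}",
    # Automatic failover reuses an existing ZooKeeper quorum
    'ha.zookeeper.quorum' => zookeepers.map { |h| "#{h}:2181" }.join(',')
  }
  { 'hadoop' => { 'core_site' => core_site, 'hdfs_site' => hdfs_site } }
end

# Example values only; none of these names come from the actual fork.
json = hdfs_ha_json(
  nameservice: 'mesoscluster',
  namenodes: { 'nn1' => 'master1', 'nn2' => 'master2' },
  journalnodes: %w[slave1 slave2 slave3],
  zookeepers: %w[zk1 zk2 zk3],
  fence_key: '/home/hdfs/.ssh/id_rsa'
)
```

In a Vagrantfile, a hash like this would typically be merged into whatever is already assigned to `chef.json` rather than replacing it.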

@everpeace
Owner Author

Thank you @24601 !!

Has your PR already been merged??

Yes, I agree with you. I think it would be better for JournalNodes and DataNodes to run only on slaves.

I really appreciate your contributions, and I'm happy to review your changes to chef.json.
I can't wait for your PR!!

@24601

24601 commented Apr 27, 2014

@everpeace, thanks for the quick reply! Happy to help, and I hope my contribution is useful. I'll be cleaning up the code and will submit a PR soon. Here are a few answers before that:

  1. Yes, it looks like https://github.com/continuuity/hadoop_cookbook has pulled in my changes (along with some of their own enhancements) to support Ubuntu 13.04 and even 14.04. In my testing, though, Mesos doesn't run so well on 14.04 yet (things broke; I'm not sure whether they were easy to fix, but I didn't bother since I found no need to move to 14.04 yet*).

  2. I'll modify things so that JournalNodes (JN) and DataNodes (DN) run on slaves only.

I'm still working on the original project this was done for, and will clean up and submit a PR once that's done!

*Uh, I kinda take that back: 13.04 is already EOL'ed. I could take a step back to 12.04 LTS, which has good support, but I'd rather figure out the leap forward to 14.04 while I'm at it... This is a bit of a separate issue, but I will likely work on it and might just throw all my changes into one PR. I know the hadoop cookbook works well with 14.04, albeit officially unsupported.

@24601

24601 commented Apr 27, 2014

@everpeace, I'm making the changes and doing the cleanup discussed above to include HDFS support and move things to Ubuntu 14.04 LTS. It's not ready for a PR yet, but if you want to follow along, the changes are occasionally synced to my fork here:

https://github.com/24601/vagrant-mesos

Feel free to make suggestions/comments. Like I said, I'm not a Ruby programmer or even too proficient with Vagrant, but I know enough to bumble-F my way through this to get it working as part of a larger project, and I'm happy to share my work, even if it's not the greatest.

@everpeace
Owner Author

I greatly appreciate your contribution again, @24601! I'm actually not so proficient in HDFS, so I'm really happy to have your help!

I've looked at your Vagrantfile and several comments. After your cleanup, I expect that:

  • It depends on my cookbook. If any changes to the cookbook are required, I would like you to open a PR for them in advance.
  • It doesn't use fixed IP addresses. You can access all the information configured by the user via ninfos or ninfo.
  • NameNode and JournalNode run on all master nodes.
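
As a rough illustration of the last two points, the role placement could be derived from node data instead of hard-coded addresses. The node list below is a hypothetical stand-in: the real structure returned by ninfos / ninfo is not shown in this thread, and the function name `hdfs_roles` and the role labels are made up for the sketch. Putting DataNodes on slaves follows the earlier comments in this issue.

```ruby
# Hypothetical sketch: map each node to its HDFS roles by node type.
# ASSUMPTION: each node is a hash with :hostname and :type keys; the real
# ninfos/ninfo data in vagrant-mesos may look different.
def hdfs_roles(nodes)
  nodes.each_with_object({}) do |node, roles|
    roles[node[:hostname]] =
      case node[:type]
      when :master then %w[namenode journalnode] # on all master nodes
      when :slave  then %w[datanode]             # on all slaves
      else []                                    # other node types get no HDFS role
      end
  end
end

# Example cluster; hostnames are placeholders, and no IP addresses are needed.
cluster = [
  { hostname: 'master1', type: :master },
  { hostname: 'master2', type: :master },
  { hostname: 'slave1',  type: :slave },
  { hostname: 'zk1',     type: :zk }
]
roles = hdfs_roles(cluster)
```

The resulting hash can then drive which recipes or services each VM gets, so adding or removing nodes in the user's configuration needs no other change.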

As for Ubuntu, I'm not a heavy user of it. I think vagrant-mesos should support the mesos-docker executor, and I understand we would have to upgrade the kernel if we used 12.04, right? So if Mesos works properly on 14.04, I'm fine with moving to 14.04.

@theclaymethod

One of the problems I've had with HDFS is that it requires absolute IP addresses. For some reason, a lot of the Hadoop ecosystem doesn't play well with relative IPs. I'm not sure if this has been fixed.

But HDFS would be very helpful. I'm not a Chef expert, so I've installed HDFS manually, along with Spark.
