Adding a specific jar/package for use with PySpark

The node_bootstrap script can be used in both cases: when the package can be installed with 'pip install' and when it cannot.
More info on node_bootstrap script:

If pip works:
Add the following line to the node_bootstrap script:
pip install <pkgname>

If pip doesn't work, download and set up the package manually. Use the node_bootstrap script for this as well:

wget <link to download the package>
<other setup steps for this package here>
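
The manual route above can be sketched roughly as follows. This is only an illustration, not the exact steps for any particular package: PKG_URL is a hypothetical placeholder for the download link, and the sketch assumes the download is a .tar.gz source archive with a single top-level directory containing a setup.py, and that "python" is the interpreter PySpark uses on the node. Packages that ship differently will need different setup steps.

```shell
#!/bin/bash
# Sketch for node_bootstrap: install a Python package without pip.
# PKG_URL is a hypothetical placeholder; set it to the package's download link.
PKG_URL="${PKG_URL:-}"

install_from_archive() {
    url="$1"
    if [ -z "$url" ]; then
        echo "install_from_archive: no download URL given" >&2
        return 1
    fi
    work_dir="$(mktemp -d)" || return 1
    # Download the archive quietly to a fixed file name.
    wget -q -O "$work_dir/pkg.tar.gz" "$url" || return 1
    # Assumption: the archive has one top-level directory containing setup.py.
    tar -xzf "$work_dir/pkg.tar.gz" -C "$work_dir" --strip-components=1 || return 1
    # Assumption: "python" is the interpreter PySpark uses on this node.
    (cd "$work_dir" && python setup.py install)
}

# Run the install only when a URL has been filled in.
if [ -n "$PKG_URL" ]; then
    install_from_archive "$PKG_URL"
fi
```

Leaving PKG_URL empty makes the script a no-op, so the sketch can be pasted into node_bootstrap and enabled later by filling in the link.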

Once the above is done and the cluster has been restarted, the package should be available for use in PySpark.
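
One way to confirm that the package landed on a node is to try importing it from the shell. The snippet below is a minimal sketch under stated assumptions: PKG_MODULE is a hypothetical placeholder for the module name, and PYTHON_BIN is assumed to point at the same Python interpreter that PySpark uses on the node, which may not be the default "python".

```shell
#!/bin/bash
# Assumption: PySpark on the node uses the interpreter named here.
PYTHON_BIN="${PYTHON_BIN:-python}"
# Hypothetical placeholder: the module name to import, e.g. the package's import name.
PKG_MODULE="${PKG_MODULE:-}"

# Succeeds (exit status 0) only if the interpreter can import the given module.
check_import() {
    "$PYTHON_BIN" -c "import $1" >/dev/null 2>&1
}

# Report the result only when a module name has been filled in.
if [ -n "$PKG_MODULE" ]; then
    if check_import "$PKG_MODULE"; then
        echo "$PKG_MODULE is importable"
    else
        echo "$PKG_MODULE is NOT importable" >&2
    fi
fi
```

A successful import here only shows that the package is installed for that interpreter on that node; it does not by itself prove that every node in the cluster ran the bootstrap script.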
