Apache Spark Developer in Washington, DC at Booz Allen Hamilton Inc.

Date Posted: 8/8/2018

Job Description

Job Number: R0034347

Apache Spark Developer

Key Role:

Design, develop, test, and implement ETL pipelines to feed a data lake. Contribute to management efforts for the Databricks cluster or a similar EMR cluster. Apply data mining techniques, perform statistical analysis and data preparation, or build high-quality prediction systems integrated with cloud-enabled products. Plan and implement pilot efforts to demonstrate the ability of parallel data processing frameworks to replace or enhance existing data warehousing capabilities. Work independently and with team members in an agile, fast-paced development environment under the direction and supervision of a senior technical lead. Write documentation of new or existing programs to ensure effective communication, and solve complex technical problems of critical importance to the organization's technical direction. Act as an expert in the field and lead the development and implementation of key technologies for the organization.

Basic Qualifications:

-3+ years of experience with Java, Spark, Scala, Hive, or ETL engineering

-2+ years of experience with processing, cleansing, and verifying data integrity for analysis

-2+ years of experience with ad hoc analysis and clear presentation of results

-Experience with architecting and building data lakes, data marts, and data warehouses

-Experience with the integration of data from multiple data sources

-Ability to obtain a security clearance

-BA or BS degree in Science, Technology, Engineering, or Mathematics

Additional Qualifications:

-3+ years of experience with Python or R

-3+ years of experience with data mining using current methods and tools

-3+ years of experience with data integration, data mining, Natural Language Processing, Hadoop platforms, or automating machine learning components

-2+ years of experience with AWS

-1+ years of experience with selecting features and with building and optimizing classifiers using machine learning techniques

-Experience with data visualization tools, including Power BI and Tableau

-Experience with NoSQL databases, including Elasticsearch, HBase, and MongoDB

-Experience with AWS data stores, including RDS Postgres and DynamoDB

-Experience in processing Big Data with Hadoop, Spark, Scala, or MapReduce

-Knowledge of one or more of the following: Jira, Git, Terraform, Chef, or Jenkins

-AWS Certification


Applicants selected will be subject to a security investigation and may need to meet eligibility requirements for access to classified information.

We’re an EOE that empowers our people—no matter their race, color, religion, sex, gender identity, sexual orientation, national origin, disability, or veteran status—to fearlessly drive change.
