Java Developer Resume
San Antonio, TX
SUMMARY
- Hadoop developer with over 8 years of experience in software development and 5 years of proficiency in the Hadoop ecosystem and Big Data systems.
- In-depth experience and solid working knowledge of HDFS, MapReduce, Hive, Pig, Sqoop, YARN/MRv2, Spark, Kafka, Impala, HBase, and Oozie.
- Currently working extensively with Spark and Spark Streaming, using Scala as the main programming language.
- Used Spark DataFrames, Spark SQL, and the Spark RDD API for data transformations and dataset building.
- Worked extensively with Spark Streaming and Apache Kafka to process live stream data.
- Strong fundamental understanding of distributed computing and distributed storage concepts for highly scalable data engineering.
- Worked with Pig and Hive and developed custom UDFs for building various datasets.
- Worked extensively on the MapReduce framework using Java.
- Strong experience troubleshooting and performance-tuning Spark, MapReduce, and Hive applications.
- Worked extensively with clickstream data to model visitor behavioral patterns, enabling the data science team to run predictive models.
- Worked on NoSQL data stores, primarily HBase, using the HBase Java API and Hive integration (see the sketch at the end of this summary).
- Extensively worked on data migrations from diverse databases into HDFS and Hive using Sqoop.
- Implemented dynamic partitions and buckets in Hive for efficient data access.
- Significant experience working with cloud environments such as Amazon Web Services (AWS) EC2 and S3.
- Strong expertise in Unix shell scripting, including regular expressions and cron automation.
- Skilled in visualizing data using Tableau, QlikView, MicroStrategy, and MS Excel.
- Exposure to Mesos and Zookeeper cluster environments for application deployments and Docker containers.
- Knowledge of Enterprise Data Warehouse (EDW) architecture, data modeling concepts such as star and snowflake schemas, and Teradata.
- Highly proficient in Scala programming.
- Experience with web technologies including HTML, CSS, JavaScript, Ajax, and JSON, and with frameworks such as J2EE, AngularJS, and Spring.
- Good knowledge of REST web services, SOAP, WSDL, XML parsers such as SAX and DOM, AngularJS, and responsive design with Bootstrap.
- Familiar with Agile and Waterfall methodologies; handled several client-facing meetings with strong communication skills.
- Good experience in customer support roles, including training and resolving production issues by priority.
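A minimal illustrative sketch of the HBase Java API usage mentioned above (hypothetical table, row key, and column names; not project code):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseWriteExample {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("visitor_profiles"))) {
                // Write one cell: row key, column family, qualifier, value (all hypothetical)
                Put put = new Put(Bytes.toBytes("visitor-123"));
                put.addColumn(Bytes.toBytes("b"), Bytes.toBytes("pattern"), Bytes.toBytes("repeat-visitor"));
                table.put(put);
            }
        }
    }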
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, YARN, Oozie, Zookeeper, Impala, Spark, Spark SQL, Spark Streaming, Storm, HUE, SOLR
Languages: C, C++, Java, Scala, Python, Swift, C#, SQL, PL/SQL
Frameworks: J2EE, Spring, Hibernate, AngularJS
Web Technologies: HTML, CSS, JavaScript, jQuery, Ajax, XML, WSDL, SOAP, REST API
NoSQL: HBase, Cassandra, MongoDB
Security: Kerberos, OAuth
Cluster Management and Monitoring: Cloudera Manager, Hortonworks Ambari, Apache Mesos
Relational Databases: Oracle 11g, MySQL, SQL-Server, Teradata
Development Tools: Eclipse, NetBeans, Visual Studio, IntelliJ IDEA, XCode
Build Tools: Ant, Maven, sbt, Jenkins
Application Servers: Tomcat 6.0, WebSphere 7.0
Business Intelligence Tools: Tableau, Informatica, Splunk, QlikView
Version Control: GitHub, Bitbucket, SVN
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, GA
Sr. Hadoop Developer
Responsibilities:
- Ingested clickstream data from FTP servers and S3 buckets using custom input adaptors.
- Designed and developed Spark jobs to enrich the clickstream data.
- Implemented Spark jobs in Scala and used Spark SQL to access Hive tables from Spark for faster data processing (illustrated in the sketch below).
- Involved in performance tuning of Spark jobs through caching and by taking full advantage of the cluster environment.
- Worked with the data science team to gather requirements for data mining projects.
- Developed a Kafka producer and a Spark Streaming consumer to work with live clickstream feeds.
- Worked with different file formats (Parquet, text) and compression codecs (GZIP, Snappy, LZO).
- Wrote complex Hive queries against external, dynamically partitioned Hive tables that store a rolling time window of user viewing history.
- Worked with the data science team to build predictive models with Spark MLlib.
- Troubleshot Spark applications using spark-shell and spark-submit.
- Wrote MapReduce programs in Java in an MRv2/YARN environment.
- Developed Java code to generate, compare, and merge Avro schema files.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala, and Python.
- Designed and developed external and managed Hive tables with data formats such as text, Avro, SequenceFile, RC, ORC, and Parquet.
- Implemented Spark RDD transformations and actions to migrate MapReduce algorithms.
- Implemented Sqoop jobs to perform full and incremental imports from relational tables into Hadoop in formats such as text, Avro, and SequenceFile, loading the results into Hive tables.
- Developed ETL scripts for Data acquisition and Transformation using Talend.
- Hands-on experience writing HQL statements as required.
- Worked extensively on importing metadata into Hive and migrating existing tables and applications to Hive and the AWS cloud.
- Involved in designing and developing HBase tables and storing aggregated data from Hive tables in them.
- Deployed Hadoop applications on a multi-node AWS cluster, using S3 for storage and Elastic MapReduce (EMR) to run MapReduce jobs.
- Participated in analysis, design, and testing phases and documented technical specifications.
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Used Impala and Tableau to create various reporting dashboards.
Environment: Spark, Hive, Impala, Sqoop, HBase, Tableau, Scala, Talend, Eclipse, YARN, Oozie, Java, Cloudera Distribution, Kerberos.
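A minimal illustrative sketch of accessing a Hive table through Spark SQL as described above, written against the Spark 2.x Java API for illustration (the project work itself was in Scala; database, table, and column names are hypothetical):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class ClickStreamEnrichment {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .appName("ClickStreamEnrichment")
                .enableHiveSupport()                    // lets Spark SQL read Hive tables directly
                .getOrCreate();

            // Query a partitioned Hive table and aggregate per visitor (names are hypothetical)
            Dataset<Row> clicks = spark.sql(
                "SELECT visitor_id, page_url FROM clickstream.events WHERE dt = '2017-01-01'");
            clicks.groupBy("visitor_id").count().show();

            spark.stop();
        }
    }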
Confidential, St Louis, MO
Sr. Hadoop Developer
Responsibilities:
- Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Issued SQL queries via Impala to process the data stored in HBase and HDFS.
- Used the Spark-Cassandra Connector to load data to and from Cassandra.
- Used FastLoad for loading data into empty Teradata tables.
- Wrote Python scripts to parse XML documents and load the data into the database.
- Good experience with Amazon AWS for accessing Hadoop cluster components.
- Experienced in transferring data from different data sources into HDFS using Kafka producers, consumers, and brokers.
- Implemented modules using core Java APIs and Java collections, and integrated the modules.
- Loaded data from different sources (databases and files) into Hive using Talend.
- Used Oozie and Zookeeper operational services for coordinating the cluster and scheduling workflows.
- Knowledge of developing NiFi flow prototypes for data ingestion into HDFS.
- Good experience building highly scalable Big Data solutions using Hadoop and distributions such as Hortonworks.
- Used Oozie to orchestrate MapReduce jobs and worked with HCatalog to open up access to Hive's metadata.
- Developed custom InputFormats in MapReduce jobs to handle custom file formats and convert them into key-value pairs.
- Responsible for building scalable distributed data solutions using Hadoop.
- Wrote custom Writable classes for Hadoop serialization and deserialization of time series tuples (illustrated in the sketch below).
- Developed Sqoop import scripts for importing reference data from Netezza.
- Used shell scripting for Jenkins job automation with Talend.
- Created Hive external tables on the MapReduce output, then applied partitioning and bucketing on top of them.
- Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, and data manipulation in Scrum environments.
- Worked with the Data Governance team to ensure metadata management and best practices.
- Implemented daily cron jobs that automate parallel tasks of loading data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs.
- Provided cluster coordination services through Zookeeper.
- Worked with BI teams in generating reports and designing ETL workflows on Tableau.
Environment: Apache Hadoop, Hive, Scala, Pig, HDFS, Hortonworks, Java MapReduce, Maven, Git, Jenkins, Eclipse, Oozie, Sqoop, Flume, SOLR, NiFi, OAuth, Teradata, FastLoad, MultiLoad, Netezza, Zookeeper, Cloudera.
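A minimal illustrative sketch of the custom Writable approach mentioned above (the field names are hypothetical stand-ins for the project's actual time series tuple layout):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Custom Writable for a (timestamp, value) time series tuple.
    public class TimeSeriesTupleWritable implements Writable {
        private long timestamp;
        private double value;

        public TimeSeriesTupleWritable() { }            // Hadoop requires a no-arg constructor

        public void set(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(timestamp);                   // serialize fields in a fixed order
            out.writeDouble(value);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            timestamp = in.readLong();                  // deserialize in the same order
            value = in.readDouble();
        }
    }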
Confidential, Chicago, IL
Hadoop Developer
Responsibilities:
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Involved in planning and implementing an additional 10-node Hadoop cluster for data warehousing, historical data storage in HBase, and sampling reports.
- Used Sqoop extensively to import data from RDBMS sources into HDFS.
- Performed data transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Developed Pig UDFs to pre-process data for analysis (illustrated in the sketch below).
- Worked with business teams and created Hive queries for ad hoc analysis.
- Created reports by extracting transformed data from Composite data warehouse.
- Responsible for creating Hive tables, partitions, loading data and writing Hive queries.
- Created Pig Latin scripts to sort, group, join, and filter enterprise-wide data.
- Worked on Oozie to automate job flows.
- Handled Avro and JSON data in Hive using Hive SerDes.
- Integrated Elasticsearch and implemented dynamic faceted search.
- Created MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, and Avro files, and sequence files for log data.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Worked in an Agile environment.
- Communicated effectively to ensure business problems were solved.
- Created files and tuned SQL queries in Hive using HUE.
- Created Hive external tables using the Accumulo connector.
- Generated summary reports using Hive and Pig and exported the results via Sqoop for business reporting and intelligence analysis.
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Java, Eclipse, SQL Server, Apache Flume, Shell Scripting, Zookeeper.
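A minimal illustrative sketch of a Pig UDF like those described above (the normalization rule is hypothetical, not the project's actual pre-processing logic):

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // EvalFunc UDF that trims and lower-cases a chararray field before analysis.
    public class NormalizeField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;                            // pass nulls through untouched
            }
            return input.get(0).toString().trim().toLowerCase();
        }
    }

In Pig Latin, such a UDF would be registered with REGISTER and invoked inside a FOREACH ... GENERATE statement.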
Confidential, Kansas City, MO
Hadoop Developer
Responsibilities:
- Developed complex MapReduce jobs in Java to perform data extraction, aggregation, and transformation, and performed rule checks on multiple file formats such as XML, JSON, and CSV (illustrated in the sketch below).
- Implemented schedulers on the JobTracker to share cluster resources among the MapReduce jobs submitted to the cluster.
- Used Sqoop to import and export the data from HDFS.
- Moved data from HDFS to Cassandra using MapReduce and BulkOutputFormat class.
- Participated with the admin team in designing and migrating the cluster from Cloudera to HDP.
- Developed helper classes for abstracting Cassandra cluster connections, acting as a core toolkit.
- Involved in Agile methodologies, daily Scrum meetings, and sprint planning.
- Wrote query mappers and JUnit test cases; experience with MQ.
- Created dashboards in Tableau to create meaningful metrics for decision making.
- Involved in designing the next generation data architecture for the unstructured and semi structured data.
- Worked with the team that analyzes system failures, identifying root causes and taking necessary action.
Environment: HDFS, MapReduce, Cassandra, Pig, Hive, Sqoop, Maven, Log4j, JUnit, Tableau
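A minimal illustrative sketch of the kind of MapReduce job described above: a mapper that counts records by type in CSV input (field positions and class names are hypothetical):

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Emits (record type, 1) for each CSV line; a summing reducer aggregates the counts.
    public class RecordTypeMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text recordType = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                recordType.set(fields[0]);              // assume the first field carries the record type
                context.write(recordType, ONE);
            }
        }
    }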
Confidential, San Antonio, TX
Java Developer
Responsibilities:
- Involved in client meetings to gather system requirements.
- Generated use case, class, and sequence diagrams using Rational Rose.
- Wrote JavaScript, HTML, CSS, Servlets, and JSP to design the application GUI.
- Strong hands-on knowledge of core Java, web-based applications, and OOP concepts.
- Developed the application using Agile/Scrum methodology, which involved daily stand-ups.
- Developed server-side components using Spring, Hibernate, Servlets/JSP, and multi-threading.
- Extensively worked with the retrieval and manipulation of data from the Oracle database by writing queries using SQL and PL/SQL.
- Implemented the persistence layer using Hibernate, with POJOs representing the persistent data.
- Used JDBC to connect the J2EE server with the relational database.
- Involved in developing RESTful web services using JAX-RS in a Spring-based project (illustrated in the sketch below).
- Developed the web application by setting up the environment and configuring the application and the WebLogic Application Server.
- Implemented back-end services using Spring annotations to retrieve user data from the database.
- Wrote AJAX scripts so that requests are processed quickly.
- Used the dependency injection and AOP features of Spring.
- Implemented unit test cases using the JUnit framework.
- Investigated issues in the production system and provided the findings to the app support team.
- Involved in bug triage meetings with the QA and UAT teams.
Environment: Spring, Hibernate, CSS, AJAX, HTML, JavaScript, Rational Rose, UML, JUnit, Servlets, JDBC, RESTful API, JSF, JSP, Oracle, SQL, PL/SQL.
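A minimal illustrative sketch of a JAX-RS resource in the style described above (path, payload, and class names are hypothetical; in the project such a resource would delegate to Spring-managed services):

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    // Exposes user data over REST; the JSON body here is hard-coded for brevity.
    @Path("/users")
    public class UserResource {
        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public String getUser(@PathParam("id") String id) {
            return "{\"id\": \"" + id + "\"}";
        }
    }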
Confidential
Java Developer
Responsibilities:
- Involved in various SDLC phases like Requirements gathering and analysis, Design, Development and Testing.
- Developed the business methods as per the IBM Rational Rose UML Model.
- Extensively used Core Java, Servlets, JSP and XML.
- Used HQL, native SQL, and Criteria queries to retrieve data from the database.
- Analyzed new CRs and service requests, provided development time estimates, and designed the database according to the business requirements.
- Wrote client-side and server-side validations.
- Wrote JSPs, Spring controllers, DAO and service classes, business logic, and CRUD screens.
- Used AJAX for a faster, more interactive front end.
- Designed and implemented the project architecture using OOAD and UML design patterns.
- Worked with the testing team to create new test cases and created use cases for the module before the testing phase.
- Provided support to resolve performance testing issues, profiling, and cache mechanisms.
- Developed DAO classes using the Hibernate framework for persistence management and integrated the frameworks used in the project (illustrated in the sketch below).
- Used Rational Application Developer as the development environment.
- Designed error logging flow and error handling flow.
- Used the Apache Log4j framework for logging.
- Followed the Scrum development cycle for streamlined, iterative, and incremental development.
- Performed code reviews to ensure consistency with style standards and code quality.
Environment: Java, Spring MVC, Hibernate, Oracle, JavaScript, jQuery, AJAX, Rational Application Developer (RAD), Log4j, HTML, CSS.
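A minimal illustrative sketch of a Hibernate DAO with an HQL query, as described above (the Customer entity and its fields are hypothetical stand-ins for the project's actual persistent classes):

    import java.util.List;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import org.hibernate.Session;
    import org.hibernate.SessionFactory;

    // Hypothetical mapped entity; the real project used its own persistent classes.
    @Entity
    class Customer {
        @Id private Long id;
        private String city;
        // getters and setters omitted in this sketch
    }

    // DAO that retrieves entities with an HQL query.
    public class CustomerDao {
        private final SessionFactory sessionFactory;

        public CustomerDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        public List<Customer> findByCity(String city) {
            Session session = sessionFactory.openSession();
            try {
                return session.createQuery("from Customer c where c.city = :city", Customer.class)
                              .setParameter("city", city)
                              .list();
            } finally {
                session.close();                        // always release the session
            }
        }
    }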