Sr. Big Data Consultant/Full Stack Developer Resume
SUMMARY
- Over eleven years of experience working on various IT systems and applications using open-source technologies, with a good understanding of big data technologies, machine learning, the SDLC and project management.
- Experience in P&C Insurance, Health Insurance, Healthcare products and Aerospace domains.
- Experience in designing and developing applications on the Hadoop stack using Java, Spark, Pig, Hive, MongoDB and Elasticsearch.
- Strong knowledge of data processing on Spark Core using Spark SQL, MLlib and Spark Streaming.
- Experience in statistical analysis (descriptive, diagnostic, predictive and prescriptive analytics) and various Machine Learning algorithms using Excel, R and Weka.
- In-depth understanding of Hadoop architecture and its components, including HDFS, MapReduce and YARN.
- Experience implementing machine learning programs in Python, R and Scala.
- Hands-on experience with various Hadoop distributions: Cloudera, Hortonworks and MapR.
- Experience implementing Internet of Things (IoT) proofs of concept (logistics and home automation, built during hackathons) on Arduino boards.
- Strong experience across the full system development life cycle (analysis, design, development, testing, deployment and support) in both waterfall and Agile methodologies.
- Strong knowledge of Python packages including scikit-learn and NLTK.
- Strong knowledge of open-source search technologies: Elasticsearch, Solr and Lucene.
- Excellent analytical, logical, programming and problem-solving skills.
- Experience designing and developing SOAP- and REST-based web services.
- A proven resource in defining systems strategy, developing systems requirements, designing and prototyping, testing, training, defining support procedures, and implementing practical business solutions under multiple deadlines
- Exposure to project and people management in captive and service environments, and to leading virtual teams across the globe.
- Passionate, flexible, collaborative, works independently, sets own goals and has a “can-do” positive attitude
Academics & Certifications:
- Bachelor’s Degree in Computer Science and Engineering from JNTU
- Cloudera Certified Developer for Apache Hadoop (CCDH)
- Big Data Analytics and Optimization from Carnegie Mellon University
- Certified in “MongoDB for DBAs” from MongoDB University
- ITIL 2011 V3 certified professional
TECHNICAL SKILLS
Big Data: Hadoop stack (HDFS, MRv2/YARN, Sqoop, Flume, Pig, Hive); Spark (Spark Core, Spark Streaming, Spark SQL, MLlib)
Machine Learning: Prediction, Classification, Clustering and Time series algorithms
NoSQL: HBase and MongoDB
Programming Language: Java, Python, R and Scala
Analytics Tools: RStudio, Weka, Excel
RDBMS: MySQL, DB2 and Oracle
Reporting: Tableau, QlikView, D3.js and Excel
PROFESSIONAL EXPERIENCE
Confidential
Sr. Big Data Consultant/Full Stack Developer
Responsibilities:
- Calculated Hadoop storage requirements for the project and deployed the cluster in the development environment.
- Installed various Hadoop ecosystem components.
- Developed the architecture for BlueSticker reporting and analytics.
- Created data ingestion pipelines from various channels through scripts written in Pig and Java.
- Loaded data into Elasticsearch with explicit mappings and indexing for quick retrieval (see the sketch after this list).
- Developed the front end in AngularJS, with server-side scripting handled through Node.js.
- Created graphs in D3.js for better visualization.
- Researched and deployed new tools, frameworks and patterns to build a sustainable big data platform.
- Worked with engineering partners to migrate approved research artifacts into the production environment.
- Supported research artifacts that had been migrated into production.
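For illustration, a minimal Scala sketch of the Elasticsearch indexing step described above: creating an index with an explicit mapping so fields are typed for quick retrieval. The index name, field names and the ES 2.x-era mapping syntax are illustrative assumptions, not project specifics.

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

object CreateIndex {
  def main(args: Array[String]): Unit = {
    // Explicit mapping: typed fields make retrieval queries fast and predictable.
    val mapping =
      """{
        |  "mappings": {
        |    "event": {
        |      "properties": {
        |        "channel":   { "type": "string", "index": "not_analyzed" },
        |        "timestamp": { "type": "date" },
        |        "payload":   { "type": "string" }
        |      }
        |    }
        |  }
        |}""".stripMargin

    // PUT the index definition to a local Elasticsearch node (assumed at :9200).
    val conn = new URL("http://localhost:9200/bluesticker_events")
      .openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("PUT")
    conn.setDoOutput(true)
    conn.setRequestProperty("Content-Type", "application/json")
    conn.getOutputStream.write(mapping.getBytes(StandardCharsets.UTF_8))
    println(s"Index creation returned HTTP ${conn.getResponseCode}")
  }
}
```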
Environment: Apache Hadoop (native), AngularJS, Node.js, D3.js, Elasticsearch, Kibana, Pig, REST services
Confidential
Technology Specialist - Data Scientist
Responsibilities:
- Designed and developed data ingestion, aggregation and advanced analytics in the Hadoop environment, sourcing from MySQL and Oracle databases.
- Streamed raw ASDI data events from many sources using Kafka.
- Identified real-time KPIs, statistical analytics, baselines and notifications to drive better actions and decisions.
- Wrote Scala programs running Spark MLlib for analytics (a sketch follows this list).
- Installed ecosystem components on the Hadoop cluster.
- Implemented data analytics using machine learning algorithms.
- Wrote Pig scripts for ETL jobs to acquire data from multiple sources and convert it into a uniform format.
- Integrated QlikView with Hive for quick visualizations.
- Researched and deployed new tools, frameworks and patterns to build a sustainable big data platform.
- Worked with engineering partners to migrate approved research artifacts into the production environment.
- Supported research artifacts that had been migrated into production.
- Designed, built, installed, configured and supported Hadoop.
- Translated complex functional and technical requirements into detailed designs.
- Analyzed vast data stores and uncovered insights.
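A minimal Scala sketch of the kind of Spark MLlib analytics job mentioned above, clustering numeric feature vectors with k-means. The HDFS path, feature layout and cluster count are illustrative assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object AsdiAnalytics {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("asdi-kmeans"))

    // Parse pre-processed CSV records into dense numeric feature vectors.
    val features = sc.textFile("hdfs:///data/asdi/features.csv")
      .map(line => Vectors.dense(line.split(',').map(_.toDouble)))
      .cache()

    // Cluster the events (k = 5, 20 iterations); the resulting centers can
    // feed baselining and notification logic downstream.
    val model = KMeans.train(features, 5, 20)
    model.clusterCenters.foreach(println)

    sc.stop()
  }
}
```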
Environment: Hortonworks HDP 2.3.1, REST services, Hadoop, Oracle, MySQL, Spark, Scala, XML and JSON.
Confidential
Sr. Hadoop Developer
Responsibilities:
- Actively involved in designing and developing data ingestion, aggregation, integration and advanced analytics in Hadoop, including MapReduce, Pig/Hive and NoSQL databases.
- Wrote MapReduce and Sqoop programs to ingest data from mainframes and MySQL databases.
- Proactively engaged with product and development teams to define next-generation product features, specifications and requirements, and researched existing web technologies to design and implement them.
- Performed data formatting, including cleaning up the data.
- Designed and prepared technical specifications and guidelines.
- Developed and maintained high-performance, highly available, scalable data processing frameworks and data models.
- Validated data integrity by exercising various Elasticsearch APIs (see the count-check sketch after this list).
- Adopted engineering best practices and developed high-quality, maintainable code.
- Contributed to infrastructure planning and security definitions.
- Evaluated alternatives and process requirements, and collaborated with the rest of the team on effective solutions.
- Delivered projects on time, to specification and with quality.
- Worked with architects, business analysts and users to define and ingest data from various sources, model it appropriately and process it efficiently.
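A minimal Scala sketch of one such integrity check: comparing a source row count against what Elasticsearch reports through its _count API. The index name, host and expected count are illustrative assumptions, and the JSON is parsed with a regex to keep the sketch dependency-free.

```scala
import java.net.{HttpURLConnection, URL}
import scala.io.Source

object ValidateCounts {
  def main(args: Array[String]): Unit = {
    val expected = 1000000L // row count reported by the source system

    // Ask Elasticsearch how many documents the index actually holds.
    val conn = new URL("http://localhost:9200/claims/_count")
      .openConnection().asInstanceOf[HttpURLConnection]
    val body = Source.fromInputStream(conn.getInputStream).mkString
    conn.disconnect()

    val indexed = """"count":(\d+)""".r
      .findFirstMatchIn(body).map(_.group(1).toLong).getOrElse(-1L)

    println(if (indexed == expected) "counts match"
            else s"mismatch: expected $expected, indexed $indexed")
  }
}
```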
Environment: HDFS, MapReduce, Pig, Java, Hive, mainframes, MySQL and Elasticsearch
Confidential
Technical Lead
Responsibilities:
- Acquired, cleaned, integrated, analyzed and interpreted disparate datasets using a variety of geospatial and statistical data analysis and data visualization methodologies, reporting and authoring findings where appropriate.
- Accountable for designing and developing data ingestion, aggregation, integration and advanced analytics in Hadoop, including MapReduce, Pig/Hive and NoSQL databases.
- Performed data formatting, including cleaning up the data.
- Designed and prepared technical specifications and guidelines.
- Developed and maintained high-performance, highly available, scalable data processing frameworks and data models.
- Adopted engineering best practices and developed high-quality, maintainable code.
- Contributed to infrastructure planning and security definitions.
- Evaluated alternatives and process requirements, and collaborated with the rest of the team on effective solutions.
- Delivered projects on time, to specification and with quality.
- Worked with architects, business analysts and users to define and ingest data from various sources, model it appropriately and process it efficiently.
- Strengthened relationships with all teams and ensured proper stakeholder management.
Environment: TOPS, COMET, COSMOS, CLAIMS HIGHWAY, PHS NICE, mainframe DB2 and ALM
Confidential
Performance Analyst
Responsibilities:
- Performed data acquisition (streaming and batch) and aggregation from disparate systems.
- Analyzed data using machine learning, statistical analysis and time-series analysis (a baselining sketch follows this list).
- Exposed data and results to consumers.
- Applied near-real-time data science to inform and drive the cloud storage product.
- Analyzed and documented findings, providing recommendations based on issues found and potential operational impacts.
- Applied a repeatable methodology to analyzing software systems for performance bottlenecks and to developing test plans.
- Participated in product roadmap development, recommending improvements to core product offerings based on performance analysis findings.
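A minimal Scala sketch of the baselining idea behind that analysis: a rolling mean over response-time samples, flagging any point that exceeds a multiple of its baseline as a bottleneck candidate. The sample values, window size and threshold are illustrative assumptions.

```scala
object Baseline {
  def main(args: Array[String]): Unit = {
    val responseTimesMs = Seq(210.0, 205.0, 220.0, 215.0, 900.0, 212.0, 208.0)
    val window = 4        // samples used to form the rolling baseline
    val threshold = 2.0   // flag samples more than 2x their baseline

    // Each sliding window holds `window` history points plus the current one.
    responseTimesMs.sliding(window + 1).foreach { w =>
      val baseline = w.init.sum / window // mean of the preceding samples
      val current  = w.last
      if (current > threshold * baseline)
        println(f"bottleneck candidate: $current%.0f ms vs baseline $baseline%.0f ms")
    }
  }
}
```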
Environment: HP LoadRunner 9.1, Rally, Quality Center 9.2, VB.NET, PHP, Windows Server 2003, ASP.NET and MySQL.
Confidential
Guidewire Analyst
Responsibilities:
- Acquired, cleaned, integrated, analyzed and interpreted disparate datasets using a variety of geospatial and statistical data analysis and data visualization methodologies, reporting and authoring findings where appropriate.
- Collaborated with business analysts and data scientists to elicit and articulate solution requirements.
- Designed scalable data analytics platforms and solutions.
- Made architectural decisions for the data analytics platform and solutions.
- Assessed, benchmarked and selected data analytics technologies.
- Recommended process improvements for gaps identified in package delivery, and worked with process owners to determine and manage implementation strategy and rollout.
- Analyzed technical and business requirements, including functional and non-functional requirements, to develop systems solutions.
Environment: Java, TeamTrack, PVCS, Visio, Excel, TestDirector, mainframe CICS/DB2 and WebSphere
Confidential
Senior Analyst/SME
Responsibilities:
- Applied data profiling and data cleansing methods (a profiling sketch follows this list).
- Collected, tracked and reported configuration management (CM) metrics and briefed senior leadership on CM status within the program.
- Supported development and QA staff with specific subject matter expertise.
- Analyzed data and performed root cause analysis.
- Worked with RDBMS and ETL tools to build new data attributes.
- Prepared and presented data flow diagrams, data quality standards and metrics.
- Articulated technical specifications, business requirements and systems specifications.
- Translated highly complex technical issues for non-technical audiences through "how-to" documentation.
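A minimal Scala sketch of the profiling pass: per-column null and distinct-value counts over a pipe-delimited extract. The file name, delimiter and column names are illustrative assumptions.

```scala
import scala.io.Source

object ProfileData {
  def main(args: Array[String]): Unit = {
    val columns = Seq("member_id", "plan_code", "effective_date")

    // Read a pipe-delimited extract into rows of trimmed fields.
    val rows = Source.fromFile("extract.psv").getLines()
      .map(_.split('|').map(_.trim)).toSeq

    // For each column, report row count, null/blank count and cardinality.
    columns.zipWithIndex.foreach { case (name, i) =>
      val values   = rows.map(r => if (i < r.length) r(i) else "")
      val nulls    = values.count(v => v.isEmpty || v.equalsIgnoreCase("null"))
      val distinct = values.distinct.size
      println(s"$name: ${rows.size} rows, $nulls nulls, $distinct distinct")
    }
  }
}
```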
Environment: Java, Visio, Excel, TestDirector, mainframe CICS/DB2 and WebSphere