Big-Data Solution Architect Resume
Dearborn, MI
SUMMARY:
- 11+ years of total IT experience, including 4 years as a Data Architect designing and developing data platforms and modeling data lakes on Big Data ecosystems (Hadoop, Spark, HBase, Hive, Impala, Solr, Sqoop, Kafka, MapReduce) for clients in the finance and legal sectors.
- Hands-on experience evaluating and designing batch and real-time data ingestion architectures.
- Experience consulting with development, business, and enterprise architecture teams on defining technology roadmaps and resolving complex data problems.
- Strong hands-on experience developing batch jobs using MapReduce and Spark SQL, and real-time data ingestion pipelines using Spark Streaming, Kafka, and NiFi (see the sketch following this summary).
- Experience scheduling batch and real-time jobs using Oozie, AutoSys, and UNIX crontab.
- Hands-on experience writing SQL and Hive/Impala queries.
- Hands-on experience parsing data in ORC, Avro, and Parquet formats.
- Experience importing and exporting data with Sqoop to and from relational databases such as Teradata and MySQL.
- Experience in data analysis and building training datasets for forecasting models.
- Experience defining data governance and data security policies for data warehouses.
- Excellent understanding of Hadoop architectures, Azure, and Pivotal Cloud Foundry platforms.
- Experience in content search, faceted search, and relevance ranking using Solr and Endeca.
- Experience in Java/J2EE, Spring, and RESTful service application development.
- Expertise in developing business analytics applications using Endeca ITL, Endeca Developer Studio pipeline configuration, Endeca EQL, and Endeca relevance ranking.
- Experience in Linux shell scripting and AutoSys job scheduling.
- Experience with tools such as Tableau, Microsoft Visio, Maven, Gradle, GitHub, Jenkins, PuTTY, Eclipse, and RTC.
- Strong communication and presentation skills; committed to cooperative work, sharing knowledge within the team, and providing technical guidance to other team members.
- Consulting roles held during my career include Data Architect, Project Lead, Software Engineer, and Programmer.
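Below is a minimal, hypothetical sketch of the kind of real-time ingestion pipeline referenced in this summary: a Spark Structured Streaming job that reads telemetry events from Kafka and lands them on HDFS as Parquet. The broker address, topic name, and paths are illustrative assumptions, not values from any client engagement.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaTelemetryIngest {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-telemetry-ingest")
                .getOrCreate();

        // Subscribe to the Kafka topic; Kafka delivers the payload as bytes.
        Dataset<Row> raw = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker1:9092") // hypothetical broker
                .option("subscribe", "telemetry-events")           // hypothetical topic
                .load();

        // Keep the payload as a string column plus the Kafka timestamp for downstream parsing.
        Dataset<Row> events = raw.selectExpr("CAST(value AS STRING) AS json", "timestamp");

        // Land micro-batches on HDFS as Parquet; the checkpoint enables restart and recovery.
        StreamingQuery query = events.writeStream()
                .format("parquet")
                .option("path", "/data/raw/telemetry")                  // hypothetical HDFS path
                .option("checkpointLocation", "/checkpoints/telemetry") // hypothetical path
                .start();

        query.awaitTermination();
    }
}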
TECHNICAL SKILLS:
Big Data Technologies: MapReduce, Apache Spark, Hive, HBase, Impala, HDFS, ORC, Avro, Parquet, Sqoop, Kafka, NiFi, and ZooKeeper.
Software Tools: Eclipse 3.1/3.5, STS, Jazz, Endeca Developer Studio, IBM InfoSphere DataStage, Tableau, Ranger, Event Hub, JMeter, Maven, Gradle, Jenkins.
Application/Web Servers: Apache Tomcat 5.0, WebLogic 10.3.
Programming Languages & Frameworks: Java, J2EE, Hibernate, REST, JUnit, Spring 3.0, Scala, Groovy, SQL.
Presentation Layer: JSP, JavaScript.
Search Engines: Endeca, Solr.
Hadoop Distributions: Cloudera, Hortonworks
PROFESSIONAL EXPERIENCE:
Big-Data Solution Architect
Confidential - Dearborn, MI
- Designed the data ingestion architecture for Ford vehicle telemetry data and defined security policies and policy groups for HDFS and Hive.
- Designed and developed a flexible Spark data ingestion framework to replace the MapReduce model (see the sketch following this project).
- Developed a Solr index, integrated with HBase, to serve real-time data.
- Developed a proof of concept for improving Hive performance by enabling Hive LLAP with Bloom filters.
- Defined best practices for Hive transactional tables and for Spark job development and deployment.
- Consulted with development, business, and enterprise architecture teams on defining technology roadmaps and resolving technical issues.
- Designed and developed telemetry data transformation pipelines for different sources.
- Provided architectural governance for inbound and outbound data flows.
Environment: Java 8, Scala 2, JUnit, Gradle, GitHub, Jenkins, Solr, Hadoop, MapReduce, Spark, Spark Streaming, Hive, HBase, Impala, HDFS, ORC, Avro, Parquet, Sqoop, Kafka, Ranger, HDP 2.2, NiFi, and ZooKeeper.
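Below is a minimal, hypothetical sketch of the kind of Spark batch ingestion/transformation job described in this project: it reads a raw Parquet landing zone, derives a partition column, and writes a partitioned ORC Hive table. The paths, column names, and table names are illustrative assumptions, not actual client schema.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.to_date;

public class TelemetryTransformJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("telemetry-transform")
                .enableHiveSupport()
                .getOrCreate();

        // Read the raw Parquet landing zone written by the ingestion layer.
        Dataset<Row> raw = spark.read().parquet("/data/raw/telemetry"); // hypothetical path

        // Light transformation: derive a date partition column and drop malformed rows.
        Dataset<Row> curated = raw
                .withColumn("event_date", to_date(col("event_ts"))) // hypothetical column names
                .filter(col("vin").isNotNull());

        // Persist as ORC, partitioned by date, into a Hive table for Hive/Impala queries.
        curated.write()
                .mode(SaveMode.Append)
                .format("orc")
                .partitionBy("event_date")
                .saveAsTable("telemetry.curated_events"); // hypothetical database.table
    }
}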
Sr. Hadoop/Spark Developer
Confidential - Charlotte, NC
- Designed and developed an enterprise event engine platform to persist web activity, call center activity, and notification alert transactions using Kafka, Hive, and HBase for predictive analytics and fraudulent transaction monitoring.
- Designed and developed an Apache Spark application to calculate monthly balances from billions of retirement balance transactions (see the sketch following this project).
- Developed multiple MapReduce and Hive jobs for daily incremental/historical data processing.
- Developed complex MapReduce jobs to denormalize transactions from different subject areas into a single transaction record, improving online channel query performance.
- Installed, configured, and developed NiFi processor groups to import/export data from/to web services, Splunk, Kafka, HDFS, and Hive.
- Designed and developed Impala/Hive schemas, big data ETL batch processing, and Endeca search components.
- Developed Cloudera Solr collections for global and dimensional search.
- Extracted and exported data between Teradata and Hadoop using Sqoop.
- Designed and developed a Solr/Impala Java application API to retrieve retirement financial information from the Solr index and Impala data stores for online channel reporting and search requirements.
- Designed and developed a Java HBase aggregator service API to join master and transactional data and aggregate transactions in memory for online channel display, improving data latency.
- Designed and implemented an Apache Spark Streaming proof-of-concept application to capture operational data in batch and near-real-time modes.
- Prepared and maintained technology roadmaps in adherence to business objectives.
- Suggested and implemented application enhancements to improve functionality.
- Developed and implemented Oracle Endeca products as per client requirements.
- Responsible for project implementation planning, execution, and process documentation.
- Participated in integration of Oracle technologies with existing enterprise architecture.
- Programmed and executed Java code and related services.
- Developed an Endeca Java interface service layer for web team integration using Java Spring.
- Provided data-level support to innovation groups.
Environment: Java 1.7, Scala 2, Spring Data, JUnit, Solr, Hadoop, MapReduce, Spark, Spark Streaming, Hive, HBase, Impala, HDFS, Avro, Parquet, Sqoop, Kafka, NiFi, and ZooKeeper.
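Below is a minimal, hypothetical sketch of the kind of Spark aggregation described in this project: it groups retirement balance transactions by account and calendar month and writes the result back to Hive. Table and column names are illustrative assumptions, not actual client schema.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.date_format;
import static org.apache.spark.sql.functions.sum;

public class MonthlyBalanceJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("monthly-balance")
                .enableHiveSupport()
                .getOrCreate();

        // Transactions previously landed in Hive by the ingestion jobs.
        Dataset<Row> txns = spark.table("retirement.balance_txns"); // hypothetical table

        // Sum signed transaction amounts per account per calendar month.
        Dataset<Row> monthly = txns
                .withColumn("txn_month", date_format(col("txn_ts"), "yyyy-MM")) // hypothetical columns
                .groupBy("account_id", "txn_month")
                .agg(sum("amount").alias("monthly_net_amount"));

        // Write the aggregate back to Hive for the online reporting channel.
        monthly.write().mode(SaveMode.Overwrite).saveAsTable("retirement.monthly_balance");
    }
}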
Java / Endeca Search Developer
Confidential - Charlotte, NC
- Understood business requirements and prepared technical design documentation.
- Analyzed data points from portal pages and designed Endeca file layouts for indexing.
- Developed Endeca data pipelines to index delimited data.
- Configured Endeca dimensional fields and defined relevance ranking modules.
- Designed WADL and developed a RESTful web service API to access Endeca MDEX data (see the sketch following this project).
- Developed various web interfaces using Spring frameworks, REST web services, JSP, and JavaScript.
- Developed JUnit test cases for application modules.
- Used SoapUI, TCP Monitor, and JMeter to validate services and measure performance.
- Developed reusable components applying standard design patterns.
- Used WebLogic and Tomcat to deploy the application to local and development environments.
- Used JavaScript for client-side validation and Spring AOP for server-side validation.
- Conducted code reviews per standards and assigned review comments to the respective owners.
- Handled production implementation and monitored Endeca baseline updates.
Environment: Java 1.6, Spring MVC, Spring-WS, REST, WebLogic 10.3, WebLogic Portal, SoapUI, TCP Monitor, Endeca MDEX 6.1.4, Platform Services 6.0.1, and Endeca Developer Studio.
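Below is a minimal, hypothetical sketch of a Spring MVC REST endpoint of the kind described in this project. The SearchService interface stands in for the layer that actually queries the Endeca MDEX engine, since those calls are project-specific; the URL path and parameter names are illustrative assumptions.

import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
@RequestMapping("/api/search")
public class SearchController {

    // Hypothetical service interface that wraps the Endeca MDEX query calls.
    public interface SearchService {
        List<String> search(String term, int page);
    }

    private final SearchService searchService;

    @Autowired
    public SearchController(SearchService searchService) {
        this.searchService = searchService;
    }

    // GET /api/search?term=...&page=0 returns the matching records as JSON.
    @RequestMapping(method = RequestMethod.GET)
    @ResponseBody
    public List<String> search(@RequestParam("term") String term,
                               @RequestParam(value = "page", defaultValue = "0") int page) {
        return searchService.search(term, page);
    }
}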
Java / J2EE Developer
Confidential
- Designed and developed application interfaces using core Java Swing, Spring, and Struts.
- Participated in integrating PDF generation technology.
- Extensively used JDBC PreparedStatement to embed SQL queries in the Java code (see the sketch following this project).
- Responsible for Planning, Design, Configuration, Development and deployment.
- Suggested and implemented application upgrade activities.
- Enhanced the existing functionality of the JDBC data server module.
- Responsible for integrating team code and mentoring teams.
- Designed a generic activity-based application model to support different court fees.
Environment: Windows 2000, Java, Swing, Struts, JSP, JavaScript, RMI, JDBC, DB2.
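Below is a minimal, hypothetical sketch of the JDBC PreparedStatement usage described in this project: the SQL is parameterized with a bind variable rather than concatenated into the query string. The table and column names are illustrative assumptions.

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class CourtFeeDao {

    private final Connection connection;

    public CourtFeeDao(Connection connection) {
        this.connection = connection;
    }

    // Look up the fee for a given activity code; the bind parameter avoids SQL injection
    // and lets the database reuse the prepared execution plan.
    public BigDecimal findFee(String activityCode) throws SQLException {
        String sql = "SELECT FEE_AMOUNT FROM COURT_FEES WHERE ACTIVITY_CODE = ?"; // hypothetical table/columns
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setString(1, activityCode);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBigDecimal("FEE_AMOUNT") : null;
            }
        }
    }
}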