Intuit Data Engineering And Analytics Resume Profile
Profile
Senior Developer and Business/Data Analyst ready to design, build, and implement ETL data-movement applications for large data warehouses and systems migrations/upgrades. Strong experience with a wide variety of databases and ETL tools, a broad information technology background, and a record of delivering in agile environments with excellent communication, interpersonal, and leadership skills.
Professional Summary
- Experience in ETL Development, Administration and Data Warehousing. Currently focused on delivering solutions in the Big Data area using Hadoop and its ecosystem.
- Design and develop end-to-end Hadoop data ingestion processes, decomposing and transforming NoSQL, binary, sequential flat-file, and XML data into Big Data platforms (HDFS/Hive, Vertica, Netezza) through Informatica, MapReduce, and Hive.
- Good knowledge of planning, installing, configuring, maintaining, and monitoring Cloudera Hadoop distributions.
- Created partitioned Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Implemented a merge-load ETL solution to load incremental data into HDFS/Hive (a sketch of this pattern follows this list).
- Installed and upgraded Informatica PowerCenter 8.6 to PowerCenter 9.x with GRID, high availability (HA), disaster recovery (DR), and VERITAS File System across all environments.
- Experience with ETL operational support and maintenance projects, running unit and integration tests on mappings and workflows.
- Experience with MPP (Massively Parallel Processing) data warehouse architecture; performed performance analysis, evaluating database schema design, table organization, optimization, and other features for Netezza and Teradata.
- Experience using the Teradata utilities SQL Assistant, BTEQ, FastLoad, MultiLoad, FastExport, and TPump, plus Unix shell scripting.
- Familiar with designing application data models and identifying primary and secondary indexes in Teradata for efficient storage and retrieval.
- Performed a large-scale migration of a data warehouse from Oracle to Netezza, using Informatica as the ETL tool with the Netezza bulk-loader connection or PowerExchange for Netezza.
- Strong knowledge of dimensional data modeling techniques, E-R modeling, multidimensional database schemas such as star and snowflake, normal forms, and handling slowly changing dimensions.
- Extensive experience working with parallel jobs, troubleshooting, and performance tuning.
- Experience with auto finance, healthcare, insurance, marketing, e-commerce and advertising data.
- Adapt rapidly to emerging technologies; able to quickly learn new technologies and proprietary tools.
- Managed and coordinated with system support, development team, and vendors.
- Experience and working knowledge of Windows, Mac OS X, UNIX, Linux and IBM AIX.
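A minimal sketch of the partitioned Hive table and merge-style incremental load mentioned above. The tables sales, sales_increment, and sales_merged and all column names are hypothetical; the pattern assumes a Hive version with windowing functions (0.11+) and the hive CLI on the PATH.

```python
import subprocess

# Hypothetical tables: 'sales' (base), 'sales_increment' (new rows, same
# schema), 'sales_merged' (reconciliation copy, same partitioned layout).
DDL = """
CREATE TABLE IF NOT EXISTS sales (
  sale_id    BIGINT,
  amount     DOUBLE,
  updated_at STRING
)
PARTITIONED BY (load_date STRING)
STORED AS ORC;
"""

# Classic pre-ACID merge: union base and increment, keep the newest row
# per key, and rebuild a reconciled copy with dynamic partitioning.
MERGE = """
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE sales_merged PARTITION (load_date)
SELECT sale_id, amount, updated_at, load_date
FROM (
  SELECT sale_id, amount, updated_at, load_date,
         ROW_NUMBER() OVER (PARTITION BY sale_id
                            ORDER BY updated_at DESC) AS rn
  FROM (
    SELECT * FROM sales
    UNION ALL
    SELECT * FROM sales_increment
  ) unioned
) ranked
WHERE rn = 1;
"""

for hql in (DDL, MERGE):
    subprocess.run(["hive", "-e", hql], check=True)
# Final step (not shown): INSERT OVERWRITE the base table from sales_merged.
```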
Technical Skills
- Specialization: Hadoop, Hive, Informatica, Oracle, Netezza, Teradata, PL/SQL, Unix shell scripting, Python
- Data Ingestion: Kafka, Flume, Sqoop
- Hadoop Ecosystem: Hive, Pig, Sqoop, Oozie, Flume, Zookeeper, HCatalog and Hue
- ETL: Informatica, OHADI, OBIEE, HDI, HDM, ODI, Diyotta
- Functional: Business Requirements Analysis and Process Mapping
- Tools: SourceTree, Jenkins, Toad, Teradata SQL Assistant, Aginity, Erwin, SwisSQL, Undraleu, DVO, HP ALM, Caché Monitor, Squirrel SQL Client
- OS: Red Hat Linux, IBM AIX, UNIX, Mac OS X and Windows
- Databases: Oracle Exadata 11g, Netezza 6/7, Vertica 7, Teradata 13, SQL Server, InterSystems Caché, DB2, Informix and MySQL
- Agile Mgmt.: JIRA, Trello, VersionOne
- Reporting: BOBJ, Cognos
- Others: ISC 2.0, VxCFS, JSON, dbMotion, Tidal
Professional Experience
Confidential
Intuit Data Engineering and Analytics (IDEA)
- ETL development and administration using Informatica Big Data Edition 9.6.1 HF1
- Onboarded data from new Amazon Web Services sources (e.g., Oracle, MySQL, SQL Server, flat files) into Big Data platforms (HDFS/Hive, Vertica, Netezza) using Informatica 9.6.1 HF1.
- Worked with Engineering, Operations, Data Governance (DG), Data Quality (DQ), and QA to bring new datasets from internal product groups (e.g., Mint, Tax, Small Business) into Intuit's main data platform, owning the process end to end.
- Provided suggestions for process optimization and ETL automation.
- Embedded pmcmd commands within a Python wrapper to help automate ETL loads from various flat-file feeds (see the sketch after this list).
- Imported and exported data between relational databases and HDFS using Sqoop.
- Configured mappings to run in the Hive environment, with Oracle as the source system and Netezza as the target.
- Created partitioned Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Implemented a merge-load ETL solution to load incremental data into HDFS/Hive on Hadoop 2.3.0-cdh5.1.3.
- Experienced in managing and reviewing Hadoop log files.
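A minimal sketch of the pmcmd-in-Python wrapper described above. The environment-variable scheme and the folder/workflow names are hypothetical placeholders; pmcmd is assumed to be on the PATH of the Informatica server.

```python
import os
import subprocess
import sys

def run_workflow(folder, workflow):
    """Start an Informatica workflow via pmcmd and wait for it to finish."""
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", os.environ["INFA_SERVICE"],   # Integration Service name
        "-d",  os.environ["INFA_DOMAIN"],    # Informatica domain name
        "-u",  os.environ["INFA_USER"],
        "-p",  os.environ["INFA_PASSWORD"],
        "-f",  folder,
        "-wait",                             # block until the run completes
        workflow,
    ]
    rc = subprocess.run(cmd).returncode
    if rc != 0:
        sys.exit(f"{workflow} failed with pmcmd return code {rc}")

# One call per flat-file feed; the names are illustrative only.
run_workflow("SALES", "wf_load_daily_sales_file")
```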
Confidential
Enterprise Data Management, Auto Finance Sales Reporting project
- ETL development and administration using Informatica 9.5.1
- Created Informatica ETL code from functional and technical specs.
- Prepared business requirements, source-to-target mapping, and metadata documents
- Wrote SQL queries to analyze business problems and performed data profiling against user requirements (a profiling sketch follows this list)
- Interacted with area sales managers, regional managers, and operational analysts to verify user requirements, manage the change control process, and update existing documentation.
- Managed and coordinated with system support, the offshore development team, and vendors.
- Performed code migration and support
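A small sketch of the kind of profiling query used for this analysis. The table and column names are hypothetical; the generated SQL would be run through any client (Toad, SQL*Plus, etc.).

```python
# Build a per-column profiling query: row count, distinct values, nulls.
def profile_query(table, columns):
    parts = ["COUNT(*) AS row_count"]
    for col in columns:
        parts.append(f"COUNT(DISTINCT {col}) AS {col}_distinct")
        parts.append(
            f"SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END) AS {col}_nulls"
        )
    return "SELECT\n  " + ",\n  ".join(parts) + f"\nFROM {table};"

# Hypothetical staging table and columns from the sales-reporting feed.
print(profile_query("sales_stg", ["dealer_id", "region", "sale_amount"]))
```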
Confidential
Enterprise Systems and Data Management
- Extracted HL7 messages from a message queue using the Informatica Unstructured Data transformation and Informatica B2B DT Studio.
- Implemented proofs of concept on the Hadoop stack and various Big Data analytics tools, including migration from databases such as Teradata, Oracle, and MySQL to Hadoop (a Sqoop sketch follows this list).
- Implemented ELT solutions to populate the IBM Healthcare Provider Dimensional Model, sourcing it from the UPMC clean staging database, source landing, MDM, and the Oracle Healthcare Data Warehouse Foundation (HDWF) model.
- Mentored UPMC staff on Netezza and Informatica best practices and performance tuning.
- Developed ETL standards and code review guidelines
- Performed troubleshooting, root-cause analysis, and solution development
- Defined and standardized attributes and entities with development teams
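A sketch of the kind of Sqoop job used in the RDBMS-to-Hadoop proofs of concept. The connection string, password-file location, and table names are hypothetical; it assumes the sqoop CLI and an Oracle JDBC driver are installed on the edge node.

```python
import subprocess

# One Sqoop import per source table; lands the data directly in Hive.
cmd = [
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@dbhost:1521/ORCL",
    "--username", "etl_user",
    "--password-file", "hdfs:///user/etl/.oracle_pwd",  # avoids plaintext
    "--table", "PATIENT_VISITS",
    "--hive-import",
    "--hive-table", "staging.patient_visits",
    "--num-mappers", "4",            # degree of parallelism
]
subprocess.run(cmd, check=True)
```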
Confidential
Data Migration (Oracle to Netezza), BI Foundation EGenome
- Designed and developed ETL processes using Informatica, Oracle, and Teradata
- Designed a new process for the Informatica upgrade from 8.6 to 9.1
- Created pmcmd shell scripts to help automate ETL loads from various flat-file feeds
- Identified bugs in existing mappings by analyzing pipeline data flow and evaluating transformations, fixed them to conform to business needs, and redesigned existing mappings to improve performance.
- Performed a large-scale migration of the data warehouse from Oracle to Netezza, using Informatica as the ETL tool with PowerExchange for Netezza
- Extracted XML from Oracle source CLOBs, parsed it with the Informatica XML Parser transformation, and transformed and loaded the result into an Oracle database (an analogous parsing sketch follows this list)
- Extracted data from JSON strings using the Java transformation and B2B DT Studio.
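The CLOB-to-relational parsing itself was done with the Informatica XML Parser transformation; the Python sketch below only illustrates the shape of that step. The document structure and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for an XML document read from an Oracle CLOB column.
clob_value = """
<order id="42">
  <customer>ACME</customer>
  <line sku="A-100" qty="3"/>
  <line sku="B-200" qty="1"/>
</order>
"""

root = ET.fromstring(clob_value.strip())
# Flatten the nested document into rows for a relational target.
rows = [
    (root.get("id"), root.findtext("customer"),
     line.get("sku"), line.get("qty"))
    for line in root.findall("line")
]
print(rows)
```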
Confidential
Informatica Upgrade to 9.1, DIAD, Media Pro
- Implemented Informatica services on grid, made single-node services highly available in a VERITAS cluster environment, and handled backing up and restoring the domain and repository content, including exporting content from repository to repository.
- Installed ODBC connections to several databases, including Netezza, Vertica, and Greenplum, and registered the corresponding plug-ins on the Informatica repository to enable these databases during POCs.
- Maintained ODBC and Oracle TNS files on Informatica servers for connectivity to databases such as Oracle, SQL Server, Caché, and Netezza
- Worked with Informatica technical support to resolve several product-related technical issues.
Confidential
WEBPRINT
- Developed ETL jobs using Teradata utilities such as TPump, FastLoad, and MultiLoad to load data.
- Wrote BTEQ scripts to transform data (a sketch follows this list).
- Wrote Teradata SQL overrides for various transformations in the mappings to improve Informatica load performance and customize transformation behavior
- Assisted in creating ETL high-level and low-level design documents
- Optimized queries to improve data warehouse performance
- Performed unit testing and user acceptance testing to verify that data extracted from the various source systems loaded into the target according to user requirements
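A minimal sketch of running a BTEQ transformation script; the tdpid, credentials, and table names are hypothetical placeholders, and bteq is assumed to be on the PATH. BTEQ reads the script from stdin.

```python
import subprocess

BTEQ_SCRIPT = """
.LOGON tdprod/etl_user,etl_password;

INSERT INTO dw.daily_sales
SELECT store_id, sale_dt, SUM(amount)
FROM   stg.sales
GROUP  BY store_id, sale_dt;

.IF ERRORCODE <> 0 THEN .QUIT 8;
.LOGOFF;
.QUIT 0;
"""

result = subprocess.run(["bteq"], input=BTEQ_SCRIPT, text=True)
print("bteq exited with", result.returncode)
```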
Confidential
ETL Developer
- Analyzed source systems and business requirements and identified business rules
- Responsible for development, support, and maintenance of ETL processes using Informatica PowerCenter
- Installed Oracle client tools and established connectivity to databases
- Created and assigned appropriate roles and privileges to users based on their activity (a sketch follows this list)
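A sketch of this role-based privilege setup in Oracle; the role, user, and object names are hypothetical, and the statements would be executed by a DBA through SQL*Plus or a similar client.

```python
# Grant object and system privileges through a role rather than per-user.
statements = [
    "CREATE ROLE etl_developer",
    "GRANT CREATE SESSION, CREATE TABLE, CREATE VIEW TO etl_developer",
    "GRANT SELECT, INSERT, UPDATE ON stage.sales_stg TO etl_developer",
    "GRANT etl_developer TO jsmith",
]
for stmt in statements:
    print(stmt + ";")
```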
Confidential
Programmer Analyst
- Tested and queried the database and handled performance issues effectively
- Performed database and SQL tuning to improve load performance
- Developed PL/SQL programs for data manipulation
- Created tablespaces, tables, indexes, users, etc.
- Implemented table partitioning to improve performance and data management (a sketch follows this list)
- Assisted in logical and physical database design
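A sketch of the range-partitioning approach used for the performance work above; the table, column, and partition names are hypothetical.

```python
# Oracle range partitioning by date: old partitions can be managed
# (compressed, archived, dropped) independently, and queries that
# filter on sale_date prune to the relevant partitions.
DDL = """
CREATE TABLE sales_fact (
  sale_id   NUMBER,
  sale_date DATE,
  amount    NUMBER(12,2)
)
PARTITION BY RANGE (sale_date) (
  PARTITION p_2003 VALUES LESS THAN (TO_DATE('2004-01-01','YYYY-MM-DD')),
  PARTITION p_2004 VALUES LESS THAN (TO_DATE('2005-01-01','YYYY-MM-DD')),
  PARTITION p_max  VALUES LESS THAN (MAXVALUE)
)
"""
print(DDL)  # run through SQL*Plus or another Oracle client
```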