Sr. Cassandra Developer Resume
New York, NY
PROFESSIONAL SUMMARY:
- Over seven years of professional IT experience in Big Data technologies and data analysis, with 2+ years of hands-on experience in the design and development of applications in Java and its related frameworks.
- Experienced in installing, configuring, and monitoring DataStax Cassandra clusters with DevCenter and OpsCenter.
- Excellent understanding of Cassandra architecture and management tools such as OpsCenter.
- Commendable knowledge of the read and write paths, including SSTables, memtables, and the commit log.
- Experience with querying on data present in Cassandra cluster using CQL (Cassandra Query Language).
- Used DataStax OpsCenter and NodeTool utilities to monitor the cluster.
- Experience in taking data backups through nodetool snapshots.
- Experience in moving SSTable data onto a live cluster.
- Experience in using Sqoop to import data into Cassandra tables from different relational databases.
- Tested applications and clusters at different consistency levels to measure read and write performance with respect to consistency level.
- Experience in importing data from various sources into a Cassandra cluster using Java APIs.
- Experience in data modeling for Cassandra.
- Experience in creating tables involving collections, TTLs, counters, and UDTs as part of data modeling.
- Hands-on experience with major components of the Hadoop ecosystem, including Hive, HBase, Pig, Sqoop, Impala, Oozie, ZooKeeper, and MapReduce on HDP.
- Good experience in using Elastic Map Reduce on Amazon Web Services (AWS) cloud for supporting data analysis projects.
- Basic knowledge of Spark and Scala programming.
- Knowledge of managing and scheduling backup and restore operations.
- Experience in benchmarking Cassandra clusters using the cassandra-stress tool.
- Developed wrapper applications involving file I/O processing and data mining using Python.
- Read data from local files, XML files, and Excel files, and handled input/output processing using different Python packages.
- Experience with Python, Jupyter, and the scientific computing stack (NumPy, SciPy, pandas, and Matplotlib).
- Good experience in all the phases of Software Development Life Cycle (Analysis of requirements, Design, Development, Verification and Validation, Deployment).
- Hands-on experience in application development using Java, .NET, RDBMS, and Linux shell scripting.
- Experience working with Java/J2EE, JDBC, ODBC, and Servlets.
- Experience using Eclipse and Visual Studio, and DBMSs such as Oracle, MySQL, and SQL Server.
- Evaluate and propose new tools and technologies to meet the needs of the organization.
- Good knowledge in Unified Modeling Language (UML) and Agile Methodologies.
- Excellent team player, self-starter with effective communication skills.
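The consistency-level testing above rests on a simple rule: a read at consistency R and a write at consistency W overlap on at least one replica when R + W > RF. A minimal sketch of that arithmetic (hypothetical helper names, not a library API):

```python
def quorum(replication_factor: int) -> int:
    """Replicas required for a QUORUM operation: floor(RF/2) + 1."""
    return replication_factor // 2 + 1

def is_strongly_consistent(reads: int, writes: int, replication_factor: int) -> bool:
    """Reads and writes are guaranteed to overlap on at least one
    replica when R + W > RF."""
    return reads + writes > replication_factor
```

With RF=3, QUORUM reads and writes (2 + 2 > 3) are strongly consistent, while ONE/ONE (1 + 1 = 2) is not.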
TECHNICAL SKILLS:
Apache Cassandra: Cassandra, DataStax, DevCenter and OpsCenter, nodetool, Spark on Cassandra
NoSQL Database: Cassandra, HBase.
Relational Database: SQL Server, MySQL, Oracle.
Hadoop Ecosystem: HDFS, MapReduce, HBase, YARN, Hive, Spark, Oozie, ZooKeeper, Sqoop, Pig.
IDEs: Eclipse, Visual Studio, NetBeans, PyCharm.
Servers: Apache Tomcat, WebLogic, WebSphere, JBoss.
Tools: Git, Maven, OpsCenter, DevCenter, nodetool, JIRA, Ant.
Operating Systems: Windows, Macintosh, Linux.
PROFESSIONAL EXPERIENCE:
Sr. Cassandra Developer
Confidential, New York, NY
Responsibilities:
- Database Architecture and Administration activities which involved Data Modeling, Configuration, Administration, Monitoring, Security Management, Performance Tuning, Replication, Backup/Restore and Troubleshooting of issues.
- Involved in the process of Conceptual and Physical Data Modeling techniques.
- Good command of CQL to run queries on data in a Cassandra cluster spanning multiple DCs with 8 nodes each.
- Analyzed the performance of the Cassandra cluster using nodetool tpstats and cfstats for thread and latency analysis.
- Used DataStax OpsCenter to monitor the health of the cluster with respect to the designed keyspaces and tables.
- Modified the cassandra.yaml and cassandra-env.sh files to set configuration properties such as node addresses, memtable sizes, and flush intervals.
- Experience in creating tables involving collections, TTLs, counters, and UDTs as part of physical data modeling.
- Involved in moving the SSTables data on to the live cluster.
- Good Knowledge on read and write paths and used different types of Compaction Strategies for performance.
- Created the necessary keyspaces and modeled column families based on the queries.
- Tested the application and the cluster with different consistency levels to check for the writes and reads performance.
- Improved table performance through load testing with the cassandra-stress tool.
- Worked with the admin team to set up, configure, troubleshoot, and scale the hardware of the Cassandra cluster.
- Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark using Scala.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs.
- Hands-on experience with Apache Spark using Scala; implemented a Spark solution to enable real-time reporting from Cassandra data.
- Experienced in implementing Spark RDD transformations, actions to implement business analysis.
- Experienced with batch processing of data sources using Apache Spark.
- Optimized performance on large datasets using partitioning, Spark's in-memory capabilities, and broadcast variables.
- Worked on the Play framework in Scala for application development.
- Experience in taking data backups through nodetool snapshots.
- Used the Java API to load data into Cassandra clusters.
- Worked on Apache Solr for indexing.
- Found and resolved issues that arose in the cluster environment.
- Created documentation for benchmarking the Cassandra cluster for the designed tables.
Environment: DataStax 4.8, Cassandra 2.0/2.1, DevCenter, OpsCenter, Spark, Shell Scripting, Cqlsh, Eclipse, Scala.
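The read/write-path and keyspace work above depends on how a partition key's token maps to replica nodes on the ring. A simplified, stdlib-only sketch of SimpleStrategy-style placement (MD5 stands in for Cassandra's Murmur3 partitioner purely for illustration):

```python
import bisect
import hashlib

def token(key: str) -> int:
    # Stand-in for Cassandra's Murmur3 partitioner; any stable hash
    # illustrates the token-assignment idea.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def replicas(key: str, ring: dict, rf: int) -> list:
    """SimpleStrategy-style placement: walk clockwise from the key's
    token and take the next rf distinct nodes on the ring."""
    rf = min(rf, len(set(ring.values())))  # can't exceed node count
    tokens = sorted(ring)
    start = bisect.bisect_right(tokens, token(key)) % len(tokens)
    out, i = [], start
    while len(out) < rf:
        node = ring[tokens[i % len(tokens)]]
        if node not in out:
            out.append(node)
        i += 1
    return out
```

With RF=3 on a three-node ring, every key lands on all three nodes; with RF=2, each key gets two distinct owners determined by its token.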
Sr. Cassandra Developer
Confidential, Austin, TX
Responsibilities:
- Complete knowledge and understanding of Cassandra architecture.
- Responsible for building scalable distributed data solutions using DataStax Cassandra.
- Involved in business requirement gathering and proof of concept creation.
- Created the necessary keyspaces and modeled column families based on the queries.
- Involved in Hardware installation and capacity planning for cluster setup.
- Involved in the hardware decisions like CPU, RAM and disk types and quantities.
- Worked with the Linux admin team to set up, configure, initialize and troubleshoot a Cassandra cluster.
- Wrote and modified YAML scripts to set configuration properties such as node addresses, replication factors, memtable sizes, and flush intervals.
- Used DataStax OpsCenter for maintenance operations and for keyspace and table management.
- Experience in working with cluster management tools such as nodetool.
- Involved in taking data backups through nodetool snapshots.
- Involved in moving SSTable data onto the live cluster.
- Restored backups through the sstableloader tool in Cassandra.
- Generated SSTables from CSV files with the help of the SSTableSimpleUnsortedWriter class in Java.
- Used Sqoop to import data into Cassandra tables from relational databases such as Oracle and MySQL.
- Tested the application and the cluster at different consistency levels to measure read and write performance with respect to consistency level.
- Monitored the Cassandra cluster with the help of the OpsCenter visualization and management tool.
- Familiar with all the internal tools of Cassandra.
- Involved in a POC for a Spark application using Scala as the programming language.
- Basic knowledge of Spark and Scala.
- Basic knowledge of the Spark Cassandra Connector to load data to and from Cassandra.
- Loaded data into the Cassandra cluster with the help of the Java API.
- Involved in data modeling the tables in Cassandra.
- Good knowledge of data modeling techniques in Cassandra.
- Used Maven for project build, dependency and documentation.
- Found and resolved issues that arose in the environment.
- Analyzed the log files to determine the flow of the application and to solve issues.
- Knowledge on using Solr for indexing.
Environment: DataStax 4.7, Cassandra 2.1, DevCenter, Cqlsh, OpsCenter, Shell Scripting, Maven, Eclipse
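The snapshot backups above are cheap because SSTables are immutable, so `nodetool snapshot` can hard-link the data files into a `snapshots/<tag>` directory instead of copying them. A stdlib sketch of that idea (hypothetical helper, simplified directory layout):

```python
import os
import tempfile

def take_snapshot(table_dir: str, tag: str) -> str:
    """nodetool-snapshot-style backup: hard-link the immutable SSTable
    files into snapshots/<tag>, so the backup costs no extra data copy."""
    snap_dir = os.path.join(table_dir, "snapshots", tag)
    os.makedirs(snap_dir, exist_ok=True)
    for name in os.listdir(table_dir):
        src = os.path.join(table_dir, name)
        if os.path.isfile(src):  # link only files, skip subdirectories
            os.link(src, os.path.join(snap_dir, name))
    return snap_dir
```

Restoring then amounts to feeding the linked SSTables back through sstableloader, as described above.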
Cassandra Developer, Data Analyst
Confidential, Dallas, TX
Responsibilities:
- Involved in data modeling the tables in Cassandra.
- Involved in implementing different data modeling techniques in Cassandra.
- Experience in working with cluster management tool like NodeTool.
- Familiar with all the internal tools of Cassandra.
- Created data models for customer data using the Cassandra Query Language (CQL).
- Found and resolved issues that arose in the environment.
- Involved in hardware decisions such as CPU, RAM, and disk types and quantities.
- Created several tables as part of data modeling and evaluated their performance.
- Monitored the Cassandra cluster with the help of the OpsCenter visualization and management tool.
- Prepared new datasets from raw data files using import techniques, and modified existing datasets using Set, Merge, Sort, Update, formats, functions, and conditional statements.
- Involved in Dashboard development utilizing Tableau Desktop Software.
- Developed MapReduce programs for the analysis of customer data.
- Developed a wrapper application in Python that invokes the Java application and performs various file I/O operations.
- Involved in Hardware installation and capacity planning for cluster setup.
- Used Python scientific computing libraries such as NumPy, SciPy, pandas, and Matplotlib for data mining.
- Developed notebooks using Jupyter and performed Python software development in Spyder.
Environment: Python 2.7, Python 3, Jupyter, Spyder, PyScripter, Enterprise Guide, DataStax, Cassandra 2.1, DevCenter, Cqlsh, OpsCenter
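The dataset preparation described above (Merge, Sort, Update) can be sketched in plain Python; the helper below is a hypothetical illustration using dict-based rows, not a specific library API:

```python
def merge_datasets(left_rows, right_rows, key):
    """Merge two lists of dict rows on `key`: columns from the right
    dataset update matching rows on the left, then sort by the key."""
    right_by_key = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        combined = dict(row)  # copy so the input rows stay untouched
        combined.update(right_by_key.get(row[key], {}))
        merged.append(combined)
    return sorted(merged, key=lambda row: row[key])
```

In practice the same merge/sort/update step is usually one `pandas.merge` plus `sort_values` call, but the logic is the same.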
Java/Hadoop Developer
Confidential, Lewisville, TX
Responsibilities:
- Imported data from different relational data sources, such as RDBMS and Teradata, to HDFS using Sqoop.
- Worked on writing transformer/mapping Map-Reduce pipelines using Apache Crunch and Java.
- Data modeled the new solution based on Cassandra and on the use case.
- Imported bulk data into Cassandra using the Thrift API.
- Involved in creating Hive Tables, loading with data and writing Hive queries which will invoke and run Map Reduce jobs in the backend.
- Performed analytics on time-series data stored in Cassandra using the Java API.
- Designed and implemented Incremental Imports into Hive tables.
- Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
- Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
- Involved in creating Hive tables, loading with data and writing hive queries that will run internally in map reduce way.
- Experienced in managing and reviewing the Hadoop log files.
- Migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
- Implemented workflows using the Apache Oozie framework to automate tasks.
- Worked with Avro Data Serialization system to work with JSON data formats.
- Worked on different file formats like Sequence files, XML files and Map files using Map Reduce Programs.
- Involved in Unit testing and delivered Unit test plans and results documents using Junit and MRUnit.
- Exported data from HDFS into RDBMS using Sqoop for report generation and visualization purposes.
- Developed scripts to automate data management end to end and keep all the clusters in sync.
- Created and maintained technical documentation for launching and operating Hadoop clusters.
Environment: Cassandra 1.2, Hbase, Hive, HDFS, Sqoop, Shell Scripting.
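The Map-Reduce pipelines above all follow the same map → shuffle → reduce pattern; a minimal in-process sketch of that pattern (the classic WordCount, in plain Python rather than the Java/Crunch used on the job):

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    """Mapper: emit a (word, 1) pair per token, as in WordCount."""
    return [(word, 1) for word in line.split()]

def reduce_phase(pairs):
    """Shuffle groups pairs by key; the reducer sums each group."""
    groups = defaultdict(int)
    for key, value in pairs:
        groups[key] += value
    return dict(groups)

def word_count(lines):
    # Chain all mapper outputs together, then reduce by key.
    return reduce_phase(chain.from_iterable(map_phase(l) for l in lines))
```

In real Hadoop the shuffle is distributed across nodes, but the per-key grouping semantics are exactly these.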
Java Developer
Confidential
Responsibilities:
- Involved in all the phases of the life cycle of the project from requirements gathering to quality assurance testing.
- Developed Class diagrams, Sequence diagrams using Rational Rose.
- Responsible for developing rich web interface modules with Struts tags, JSP, JSTL, Confidential, JavaScript, Ajax, and GWT.
- Developed the presentation layer using the Struts framework and performed validations using the Struts Validator plugin.
- Created SQL scripts for the Oracle database.
- Implemented the business logic using Spring transactions and Spring AOP.
- Implemented persistence layer using Spring JDBC to store and update data in database.
- Produced web service using WSDL/SOAP standard.
- Implemented J2EE design patterns such as the Singleton and Factory patterns.
- Extensively involved in the creation of Session Beans and MDBs using EJB 3.0.
- Used Hibernate framework for Persistence layer.
- Extensively involved in writing Stored Procedures for data retrieval and data storage and updates in Oracle database using Hibernate.
- Built and deployed the application using Maven.
- Performed testing using JUnit.
- Used JIRA to track bugs.
- Extensively used Log4j for logging throughout the application.
- Produced a Web service using REST API with Jersey implementation for providing customer information.
- Used SVN for source code versioning and code repository.
Environment: Java, J2EE, JSP, Struts, JNDI, HTML, XML, UML, Rational Rose, Eclipse, Apache Tomcat, MySQL, Java Script, AJAX, SVN.