SAS Programming Resume Profile
PROFESSIONAL SUMMARY
- 15 years' experience in Decision Sciences / Statistical Modeling, SAS Programming, and Marketing / Finance / Operations Analytics
- Well qualified to be a Data Scientist with marketing / finance / technology / business applications
- Excellent domain knowledge in the functional areas of marketing, finance, and operations
- Trained mind for Business Analysis, Strategic Thinking, Corporate Planning, Pricing, Optimization, Statistics
- Skilled in integrating ideas across the knowledge spectrum to draw conclusions for tangible results
- Statistical experience in modeling and analysis using univariate and multivariate techniques for predictive modeling, such as ordinary regression, logistic regression, chi-squared automatic interaction detection (CHAID), Bayesian belief networks, survival analysis / proportional hazards regression, analysis of variance, time series analysis, discriminant analysis, cluster analysis, and factor analysis (data reduction). SAS was the software used for data manipulation and modeling.
- Good quantitative skills to draw meaning out of data by simpler techniques such as classification, cross-classification analysis, correlational studies for finding relationships in data, and applied mathematical skills for other progressive model-building efforts using discrete and functional relationships.
- Graduate courses in statistics include probability theory, mathematical statistics, stochastic processes, statistics for decision making, probability and statistics, statistical methods, regression analysis, design of experiments / analysis of variance, time series analysis, multivariate methods, scaling methods, and marketing models.
- Strong conceptual background in the principles and strategies of marketing, finance, and operations, gained through broad-based as well as technique-specific business education, with the ability to think 'enterprise' and deliver business solutions grounded in these theories even in an analytical environment.
- Experienced in working with both longitudinal and cross-sectional data, and with pricing and optimization problems.
- Concepts in marketing, modeling, analytics, statistics, SAS
- Innovative ideas in profiling, then segmentation and then targeting with marketing and incentive programs. Associations between usage rate, brand loyalty and involvement.
- Good knowledge of information technology through experience with an enterprise resource planning package in a client/server environment.
- Statistical software: SAS (Base SAS, SAS/STAT, SAS macros, etc.) on UNIX, PC, and mainframe platforms, together with Microsoft Office products, was used for most projects from graduate school onward. SPSS was also used in graduate school.
- Experienced in using SAS Enterprise Guide and SAS Enterprise Miner
- Cross Functional And Interdisciplinary Interests
EXPERIENCE
Confidential
- A Project in Reliability. The failure distribution of laser jet printers is modeled using a Weibull distribution with estimated parameters. The failure probability for each month following each originating month of sales is estimated from the Weibull distribution and multiplied by the sales of that originating month. For each calendar month, the theoretical failure numbers arising from all earlier originating months are added to give the total expected failures as of that month. These numbers are compared with the actual failures from history, and a function is fitted to minimize the sum of squared errors using nonlinear regression techniques. This gives the trajectory of failures in the shape of a 'bathtub aging curve'. This curve can be used to predict or forecast warranty costs across time, which can be incorporated into pricing decisions. Programming was done in SAS on SAS Enterprise Guide version 5.1.
- A Project in Text Mining. Survey data with descriptive comments, in addition to scale-based numerical scores, about the user-friendliness of HP's Global Procurement tool are analyzed using SAS Text Miner 12.1 as embedded within SAS Enterprise Miner. The significant terms in the comments and their associations (concept linking) are the basis of analysis for predictive modeling of sentiments. Terms are parsed and filtered for synonyms, low-frequency terms, and clusters of associations using the appropriate nodes. If there are too many terms, the dimensionality of the terms is reduced using singular value decomposition, similar to principal component analysis, to reduce the computational burden in model fitting. A predictive model is fitted to estimate and predict sentiments. Numerical score data are analyzed using traditional statistical methods for significant differences between subpopulations by procurement type, demographic data, etc.
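The expected-failure calculation in the reliability project above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the project itself was programmed in SAS). The function names, the monthly sales figures, and the coarse grid search used as a stand-in for nonlinear regression are all assumptions, not the original implementation.

```python
import math

def weibull_cdf(age, shape, scale):
    """Probability that a unit has failed by the given age (in months)."""
    if age <= 0:
        return 0.0
    return 1.0 - math.exp(-((age / scale) ** shape))

def expected_failures(sales, shape, scale):
    """Expected failures per calendar month, summed over all earlier sales months.

    sales[i] is the number of units sold in month i (hypothetical input).
    """
    months = len(sales)
    expected = [0.0] * months
    for month in range(months):
        for origin in range(month + 1):
            age = month - origin
            # Probability of failing during this particular month of life.
            p_fail = weibull_cdf(age + 1, shape, scale) - weibull_cdf(age, shape, scale)
            expected[month] += sales[origin] * p_fail
    return expected

def sum_squared_error(actual, sales, shape, scale):
    """Sum of squared differences between actual and expected failures."""
    predicted = expected_failures(sales, shape, scale)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def grid_fit(actual, sales, shapes, scales):
    """Assumed stand-in for nonlinear regression: pick the (shape, scale)
    pair on a coarse grid that minimizes the sum of squared errors."""
    candidates = [(k, s) for k in shapes for s in scales]
    return min(candidates, key=lambda c: sum_squared_error(actual, sales, c[0], c[1]))

# Illustrative data only: synthetic "actual" failures from known parameters.
sales = [1000, 1200, 900, 1100, 1000, 950]
actual = expected_failures(sales, 1.5, 20.0)
best = grid_fit(actual, sales, [0.5, 1.0, 1.5, 2.0], [10.0, 20.0, 30.0])
print(best)
```

In practice the grid search would be replaced by a proper nonlinear least-squares routine, and the fitted curve would then be extended forward to forecast warranty costs.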
Confidential
- SAS programming, modeling and analytics for retail private label credit card business.
- Data management, mining, and account selection for regulatory compliance, programmed in SAS.
- Coding complex logic according to criteria for segmentation and partitioning of accounts
- Applying set-theoretic notions in partitioning and selection of accounts
Confidential
- Statistical modeling and testing across the spectrum in a research environment. The protocol consists of formulating hypotheses, operationalizing the constructs with variables that capture them, devising appropriate statistical testing procedures, conducting the tests using SAS, interpreting the results, and writing them up.
- Hand-coded SAS programming as well as use of SAS Enterprise Guide version 5.1
- Techniques used are regression, logistic regression, generalized logit, CHAID, survival analysis, and ANOVA. In level-one analysis, the graphics used are charts such as bar charts and pie charts, and plots such as spline plots and line plots in 2D and 3D.
- The constructs tested concern student performance, predictive modeling to identify at-risk students and provide early warning, evaluation of instructor performance, administrator oversight, etc.
- Data captured online, real time in a learning management system and integrated with demographic and background information
Confidential
- Profiling is finding the degree of association between the variable of interest (adopter / non-adopter of a brand or product) and some other targetable attribute of the customer, so that customers can be differentiated, segmented, and targeted with the pricing and promotional stimuli appropriate for them. From existing data, profiles can be identified with cross-tab cell frequencies and also examined statistically by chi-squared tests of independence (after making the data categorical, if necessary). These methods were used to profile adopters of a personal hygiene product. The customers were then segmented across these profiles, and each segment was examined for its media habits and response elasticity for pricing, discounts, and other promotional incentives. SAS was used for data manipulation and modeling.
- Maximally discriminating attributes between buyers and non-buyers of a personal hygiene product were identified using discriminant analysis and the resulting discriminant function. Profiling between adopters and non-adopters was examined along an intersection of these attribute levels and segmentation and targeting, as described earlier, was adopted for each of the significant sub-groups. SAS was used for data manipulation and modeling.
- Maximally discriminating variables between buyers and non-buyers of a personal hygiene product were identified using discriminant analysis. These variables were used as independent variables in a logistic regression procedure and probability of purchase estimated for each prospect. A cut-off probability was established to determine the target audience. Marketing programs in terms of pricing, promotion and distribution were strategized for this audience. SAS was used for data manipulation and modeling.
- Examining brand switching and transience in the trial and repeat environment of a frequently purchased product, profiling the brand loyal versus the switchers, and examining the intervening effects of competitive promotions. Customers are segmented on usage rate, with particular attention paid to heavy users. The interaction effect (degree of association or independence) between usage rate and switching is identified by use of chi-squared tests.
- Transition probabilities between the company brand and competitive brands in the trial and repeat environment of a frequently purchased product are estimated using multinomial logit regression for the sub-population in each row, which represents the present choice of brand. These conditional probabilities are put together to make up the transition matrix. The transition probabilities are assumed to be stationary from repeat to repeat and are used to project and forecast future demand across time periods, taking into account buying intervals and starting from current volume. This is a method of predictive analytics.
- Forecasting the level, trend, and seasonal projections of demand from longitudinal data on a quarterly and monthly basis by smoothing for level, trend, and seasonality in the time series using the Holt-Winters model. Both classical and Box-Jenkins methodologies were tried. Cross-sectional variables such as the firm's price, competitors' prices, promotional incentives, and macroeconomic indicators were later built into the model. The final objective was to identify a price that would optimize revenue and profit. The modeling approach was applied to demand for consumer durable products such as washing machines and television sets.
- Designed an experiment to test the significant differences between the mean responses in a deposit campaign where the treatment factors were interest rates and other promotional features
- Estimating and predicting the creditworthiness of prospective clients of a lending institution using logistic regression and predictive modeling. The efficacy of the model was examined for significance of sample-to-population inference, goodness of fit, prediction, and explanation. SAS was used for data manipulation and modeling.
- Customers were clustered with the hierarchical clustering procedure into groups that are homogeneous within and heterogeneous between. This is unsupervised discrimination. These clusters were then examined for differences in their marketing response. SAS was used for data manipulation and modeling.
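The brand-switching projection described above, in which a stationary transition matrix is applied repeatedly to current volume, can be illustrated with a minimal sketch in Python. The function name and the two-brand matrix are hypothetical. In the project itself the rows of the matrix were estimated by multinomial logit regression in SAS, which this sketch does not attempt to reproduce.

```python
def project_shares(shares, transition, periods):
    """Project brand shares forward under a stationary transition matrix.

    transition[i][j] is the assumed probability that a buyer of brand i
    buys brand j on the next purchase occasion. Each row must sum to 1.
    Returns the list of share vectors, starting with the current shares.
    """
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to 1")
    n = len(shares)
    current = list(shares)
    history = [current]
    for _ in range(periods):
        # New share of brand j = sum over brands i of (share of i) * P(i -> j).
        current = [sum(current[i] * transition[i][j] for i in range(n))
                   for j in range(n)]
        history.append(current)
    return history

# Illustrative numbers only: company brand vs. one competitive brand.
transition = [
    [0.8, 0.2],  # current buyers of the company brand
    [0.3, 0.7],  # current buyers of the competitive brand
]
history = project_shares([0.5, 0.5], transition, periods=2)
print(history)
```

Multiplying each projected share by the expected number of purchase occasions per period would convert the shares into a demand forecast. That step depends on the buying intervals and is left out here.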
Confidential
- Incorporating robustness in mailer selection for direct marketing. Developed a method to handle sampling risk in predictive model-based selection of a mailing list by using multiple models fitted on bootstrap samples instead of one single predictive model, where variability in prediction can only be theoretical. The variance of prediction across the multiple models characterizes the sampling risk (a measure used to minimize forecast error). Predictions are consolidated and risk is characterized to produce a Pareto-optimal efficiency frontier by a method of constrained optimization. The frontier offers the decision maker a choice: the highest possible return for a given degree of risk, across the risk profiles characterized by the different combinatorial selections of the calling lists. The decision maker can choose according to his or her attitude to risk. The return optimized in the model is the risk-adjusted contributed value, where the risk is the financial risk. Thus this project is a mixture of marketing, finance, and statistics. The present application is in database marketing for GE Consumer Finance. Programming was done in SAS.
- Improving the predictive probabilities in a Bayesian belief network by appropriate binning. Developed an algorithm to determine a heuristically ideal binning (that is, one minimizing the maximum regret with respect to a criterion) by considering one block of related variables at a time, where a block consists of all the related parents, children, and siblings connected even by a step relationship, i.e. a relatively local cluster. The decision lies in the options for discretizing continuous variables (that is, determining the binning-level cutoffs), which result in different state-space combinations. The objective is to achieve the best resolution of the predictive relationships in the context of the causal structure of the Bayesian belief network through a proper choice of discretized state space. The principle used is that the information content and predictive relationships in all a posteriori predictions are enhanced when the chosen discretized state space is as far from independence as possible. To operationalize this, the formula traditionally used for the chi-square statistic is used to gauge independence, though inference-type extensions of the decision scenarios are not used at this stage. Programming was done in SAS.
- Prediction of multiple classes using logistic regression in a sequential binary protocol instead of a one-shot cumulative logit, which gives a lopsided prediction biased in favor of the majority class. In the sequential binary protocol, the first binary model predicts the majority class versus the rest in the validation sample, based on a cut-off probability determined in the training sample that maximizes the sum of sensitivity and specificity on the ROC curve. Those classified as the rest in the validation sample are then classified between the next majority class and the rest by the next binary model in the sequence, which has been fitted to classify those respective groups, again based on a probability cut-off chosen to maximize sensitivity plus specificity for those groups. The sequence proceeds until the final two groups are classified by their respective binary model. The number of binary models estimated, each with the appropriate groups, is the number of classes minus one. This protocol produces a more even classification, with higher accuracies for all classes, than a one-shot cumulative logit. Also, after the model structure is determined by a stepwise procedure with the entire data, the model is robustly validated with a training-validation split in a rotation schedule as explained below. Programming was done in one stretch by using macros in SAS.
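The sequential binary protocol above can be illustrated with a minimal sketch in Python. It covers only two pieces: choosing the cut-off that maximizes sensitivity plus specificity on training scores, and passing a case through the ordered sequence of binary models. The function names, the score functions, and the example numbers are hypothetical. Fitting the logistic models themselves, which was done with SAS macros, is assumed to have happened already.

```python
def best_cutoff(scores, labels):
    """Return the cut-off that maximizes sensitivity + specificity.

    scores are predicted probabilities from one binary model on the training
    sample; labels are 1 for the target class and 0 for "the rest".
    A case is predicted positive when its score >= cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in the training sample")
    best, best_value = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        value = tp / positives + tn / negatives
        if value > best_value:
            best, best_value = cutoff, value
    return best

def classify_sequentially(case, stages, final_class):
    """Apply binary models in order, largest class first.

    stages is a list of (class_label, score_function, cutoff) tuples, one per
    binary model, so there are (number of classes - 1) stages. A case that no
    stage claims falls through to final_class.
    """
    for label, score_fn, cutoff in stages:
        if score_fn(case) >= cutoff:
            return label
    return final_class

# Illustrative training scores for the first stage (majority class vs. rest).
cutoff_a = best_cutoff([0.1, 0.2, 0.6, 0.8, 0.9], [0, 0, 1, 1, 1])

# Hypothetical pre-computed scores stand in for fitted logistic models.
stages = [
    ("A", lambda case: case["p_a"], cutoff_a),
    ("B", lambda case: case["p_b"], 0.4),
]
print(classify_sequentially({"p_a": 0.2, "p_b": 0.5}, stages, final_class="C"))
```

With three classes there are two stages, matching the rule that the number of binary models is the number of classes minus one.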