Saturday, October 31, 2015

White Wine-Data Analysis Using R

In this post, I showcase how the R programming language can be used for exploratory data analysis in a data science project. Exploratory data analysis is a crucial part of any data science project: it not only helps in understanding the data better but also helps in finding anomalies in the data. In the analysis linked below, I work through a series of questions to make the concepts clearer, using univariate, bivariate, and multivariate analysis on the white wine data set. The goal is to determine the characteristics that most influence the quality of white wine.

Please follow the link below:

https://rawgit.com/rajivgrover009/DATA_ANALYST_NANO_DEGREE_UDACITY/master/Project%204/Project4_data_analysis_whitewine.html

Friday, October 30, 2015

Effective Data visualization-Data Science


An effective data scientist is not only good at deriving insights from data but also at explaining the findings to an audience. Data visualization plays an important part in a data scientist's career. Although data visualization, taken literally, is just plotting data, it is quite different from exploratory data analysis: it is better described as explanatory data analysis. Think of it as a bridge between the end product of data science and the audience, which can be the management of an organization or just a general viewer.
Using effective data visualization techniques, a data scientist can narrate the crux of the problem. Narrative structure can fall into one of the following categories:

Tools and languages: Some of the commonly used tools and libraries are:
  • D3.js: http://d3js.org/
  • Dimple.js: http://dimplejs.org/
  • Tableau Public: https://public.tableau.com/s/
  • Plotly: https://plot.ly/
In an attempt to learn this useful skill, I created my first hybrid data visualization using d3.js and dimple.js.
To give a little background, I used dimple.js and d3.js to draw a bar chart of CO2 emissions in the G7 countries at five-year intervals from 1961 to 2010. The bar chart offers a comparative view of the rising CO2 emission levels. The user can click a year value on the right-hand side to view the CO2 emissions for that year. By default, the chart starts an animation, which continues until the user selects a particular year.
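
The animate-then-select behaviour described above can be sketched roughly as follows. This is a minimal sketch and not the code behind the actual chart: the field names `Country`, `Year`, and `CO2`, the chart size, the frame duration, and the helper names `distinctYears`, `drawChart`, and `showYear` are all assumptions made for illustration. It relies on the dimple.js storyboard feature (`setStoryboard`, `pauseAnimation`, `goToFrame`).

```javascript
// Sketch only: field names ("Country", "Year", "CO2"), sizes, and helper
// names are assumptions, not taken from the original chart.

// Return the distinct Year values present in the data, sorted ascending.
// Values keep their original type so they can be handed back to the
// storyboard unchanged when a year label is clicked.
function distinctYears(rows) {
  const seen = new Set();
  rows.forEach((row) => seen.add(row.Year));
  return Array.from(seen).sort((a, b) => Number(a) - Number(b));
}

// Build the animated bar chart. `dimple` is the global exposed by dimple.js
// in the browser; it is passed in so this file also loads outside a browser.
function drawChart(dimple, rows, selector) {
  const svg = dimple.newSvg(selector, 800, 500);
  const chart = new dimple.chart(svg, rows);
  chart.addCategoryAxis("x", "Country");
  chart.addMeasureAxis("y", "CO2");
  chart.addSeries(null, dimple.plot.bar);
  // A storyboard steps through one frame per distinct Year value.
  const story = chart.setStoryboard("Year");
  story.frameDuration = 1500; // milliseconds per frame (arbitrary choice)
  chart.draw();
  return story;
}

// Click handler for a year label: stop the automatic animation and
// jump to the frame for the selected year.
function showYear(story, year) {
  story.pauseAnimation();
  story.goToFrame(year);
}
```

In a browser, something like `const story = drawChart(dimple, rows, "#chart")` would start the animation, and each label produced from `distinctYears(rows)` could call `showYear(story, year)` from its click handler. The `#chart` selector is likewise a placeholder.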

Below is the link to the chart:



Here is a static thumbnail of the chart: