Saturday, February 27, 2016

The Missing Social Media Skill Set for Businesses

Use of social media has grown at an exponential rate over the last decade. Where in February 2004 Facebook had just a handful of Harvard students using the newly launched thefacebook.com, today, in 2016, its user base spans nearly one third of the planet's population. Facebook has not only changed the way we interact with our friends but has also changed how businesses interact with their customers. Businesses need to take advantage of the growing influence of social media on their customers' preferences. In this article, we will analyze some of the key issues that businesses need to address in order to keep pace with social media.

KEY ISSUES:
  • Contemporary workforce is ill equipped.
  • Lack of social media courses at universities.
  • Employee social media advocacy is underestimated.


ANALYSIS OF ISSUES:

Contemporary Workforce is ill equipped:

Companies have widely adopted social media in order to create brand awareness and grow their user base. Nearly 90% of companies use social media networks to pursue customers, who are in turn highly influenced by what they see on these platforms.
However, there is one key hurdle to fully unlocking the potential of social media: the workforce at these companies is not well trained to take advantage of it. There is a considerable skill gap in the workforce due to a lack of formal training inside companies. Even though the use of social media is very common among the general public, only 12% of companies (out of 2,100 surveyed) were using social media effectively, according to a Harvard Business Review survey in 2010. More recent research by Capgemini confirmed that these numbers have not improved significantly. How social media is used within a company is changing day by day, and companies need to keep pace with this change. The growing number of job postings for specialized social media skills is evidence that businesses need people who are well equipped to drive their marketing strategies on social media.
The challenge is not only the availability of formal training programs but also finding the right training solutions. Most employees do not have time for in-depth training, so it is important to provide on-demand, mobile-friendly training solutions. Nevertheless, with the ever-changing use and growing feature set of social media, companies need to invest in programs to train their workforce.

Lack of social media courses at universities:

Another important factor in the skill gap is the lack of specialized courses at universities. Effective use of social media in business is essentially a skill, and it needs to be taught as a separate course at universities. Even though it is very important to put adequate training programs in place for the existing workforce, it is equally important to train the next generation of workers right from university. Bridging the skill gap at the university level is a highly effective strategy.
Specialized social media courses do exist at universities, but most of them are limited to marketing and communication degree programs. The use of social media at companies is no longer limited to specialized managers; it has become highly decentralized within the company. It is therefore important to build the required understanding by incorporating these courses as core courses in all applicable degree programs. Such programs would not only offer a strong foundation for the future workforce but would also create unprecedented opportunities for businesses to use social media effectively.
           
Employee social media advocacy is underestimated:

Another important factor that businesses often overlook is employee advocacy on social media. As already mentioned, the effective use of social media is no longer limited to specialized managers but has become highly decentralized. A lesser-known fact about Starbucks' successful Twitter campaign "Tweet-a-coffee", which generated $180,000 in sales, is that Starbucks' own employees were actively involved in tweeting and posting. The employee advocacy program at Starbucks encouraged employees to participate in the campaign and create a buzz. Other companies, such as Zappos and Southwest Airlines, have also harnessed the power of employee advocacy in executing social media campaigns.
However, the majority of companies still overlook the importance of employee advocacy. The fundamental way to build an audience on a social media network is through people, and employees can help achieve this very quickly. By encouraging employees to share company updates on social media networks, a company can steadily build a substantial audience without any special funding or investment. For example, a mid-size company with 100 employees, each with an average of 250 followers, can reach up to 25,000 unique followers just by encouraging its employees to participate in these campaigns. The fundamental driving force in employee advocacy is not just sharing content; it is word of mouth, which creates a huge impact on social media. Reportedly, content shared by employees creates 8 times more engagement than content shared through official channels. Very few companies have truly embraced this fact and taken advantage of employee advocacy in their social media campaigns. Given these benefits, the contribution of employee advocacy to a successful campaign cannot be understated.
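The back-of-the-envelope arithmetic above can be sketched as a short calculation. This is a hypothetical illustration only; the function name and the optional overlap adjustment are our own additions, not part of any cited study:

```python
def potential_reach(num_employees, avg_followers, overlap=0.0):
    """Estimate the unique audience an employee-advocacy program could reach.

    overlap is an assumed fraction of followers shared between employees;
    the text's example implicitly assumes no overlap at all.
    """
    return int(num_employees * avg_followers * (1.0 - overlap))

# The mid-size company example from the text: 100 employees, 250 followers each.
print(potential_reach(100, 250))              # 25000

# A more conservative estimate assuming 20% of the audiences overlap.
print(potential_reach(100, 250, overlap=0.2))  # 20000
```

In practice follower audiences overlap heavily, which is why the overlap parameter matters for any realistic estimate.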
Therefore, companies need to start focusing on employee advocacy. Even though employee advocacy can help a business a great deal, it needs to be monitored in some way to ensure that it does not leave a negative impact on the brand image. This is only possible when there is enough awareness among employees.
CONCLUSION:

The growing use of social media and its impact on customers' buying preferences make it evident that all companies (small, mid-size and large) need to focus on social media to sustain their business. The large user base of social media networks has made them the most important medium for advertising. Done properly, a company can create a buzz about its product in far less time than is possible through traditional media such as newspapers or TV. On Facebook alone, over 1.44 billion users spend on average more than 20 minutes each day, which is 20% of total online time.
Although many companies have realized the true potential of social media and invested heavily, a large number of companies have still either not embraced the trend or have overlooked it. Social media has not only changed the way businesses advertise their products but has also changed the overall paradigm of customer interaction. Nowadays, it is not sufficient to create brand awareness on social media; it is also important to address customer concerns and complaints through these networks. Social networks have replaced the traditional mode of customer interaction, where the customer used to call the helpdesk and log a complaint. Customers have become ever more transitory and are highly influenced by online reviews, product ratings and the like. If these interactions are not handled properly, many complaints can go unnoticed and have a negative impact on the business.
The ubiquitous use of social media and its impact on business prove that businesses need a specialized skill set to deal with the complexity involved. This is where things get tricky, because companies have neither the required skills nor the time to fully reap the potential of social media. It follows from our discussion so far that a successful social media marketing campaign must involve employees at all levels, but who owns this initiative, and who takes responsibility for user actions? Employee updates, if not guided properly, can do more harm than good. Employees' actions and participation in these campaigns need to be well aligned with the overall objective. Therefore, it is essential to train employees on how the brand image can be affected by sharing a given type of update. This awareness would not only ensure a positive impact but would also ensure that employee actions stay aligned with business objectives.
The next question in creating employee advocacy is what motivates employees to participate in these campaigns. Advocacy is essentially not something that can be enforced; employees need to be self-motivated to help achieve the business objective through these campaigns. Companies can help by creating a positive, healthy culture that motivates employees to participate. Only then can these campaigns achieve full success.

Nevertheless, companies need to start focusing on the importance of social media: by creating the required training programs, creating employee awareness, encouraging employee participation, handling customer interaction, and keeping pace with changing features and new technology in the social media world. Only then can companies reap its true potential.

References:
http://www.fastcompany.com/3055665/the-future-of-work/inside-the-growing-social-media-skills-gap
http://www.fastcompany.com/3053233/hit-the-ground-running/how-to-turn-your-entire-staff-into-a-social-media-army

Monday, February 22, 2016

Security and Privacy in the Information Era

The unprecedented opportunities offered by big data in advancing science, health care, economic growth, education, social interaction and entertainment are being pursued by businesses across all of these verticals. However, the underlying risk to data privacy and security remains a major bottleneck in the big data era. We have witnessed an increasing number of data breaches in recent history. The security loopholes behind most of these breaches make it evident that we have very little control over access to this data, especially when it comes to third-party sharing. Aggregation and mining of public data is becoming a common practice among scientists, businesses, clinicians and even government agencies.
Big data analytics has provided a set of useful open source tools for data mining and modeling, but there is still a lack of effective frameworks and approaches for ensuring security and privacy in this highly distributed environment. One of the foundation pillars of the big data era is the ability to share and mine data, yet there is very little focus on implementing strict security and privacy principles when it comes to third-party data sharing, even as the vulnerabilities around these data sets keep expanding. In this article, we will discuss some of the key issues related to security and privacy in the big data era, and we will also discuss the SMW (Secure Medical Workspace) for controlled data access.

KEY ISSUES:
  • Increasing data with increasing security vulnerabilities.
  • Existing solutions are incapable of protecting data.
  • The need for new approaches to protect data.


ANALYSIS OF ISSUES:
Increasing data with increasing security vulnerabilities:
We have witnessed a huge increase in big data accessibility in recent years, and some of this aggregated data is made available for public use by government agencies. The increasing computing capability provided by modern computing solutions makes it possible to extract and mine massive data sets. For instance, surveillance programs run by the National Security Agency (NSA) collect massive amounts of data through data-intensive programs, and with the opening of the Utah Data Center these efforts are anticipated to grow at a significant rate; the center's computational goal is to reach the exaflop level by 2018. This growth is not limited to government agencies: private businesses, hospitals and researchers are also at the forefront of data collection and mining, using the power of modern computing. The major security concern with this practice lies in sharing data with third parties. There is very little or no control over the data once it has left the premises of the original data collection agency.
Large-scale data collection and sharing is commonplace, with inadequate frameworks to ensure the security and privacy of this confidential data. A lack of adequate training and understanding of data security and privacy has led to this situation, so security and privacy concerns are increasing at the same rate as the data itself. Hacking incidents have become more dangerous because of the availability of these massive data sets, and data leakage has become more alarming than ever before. For instance, the March 2012 hack of the Utah Department of Health databases led to the loss of personal data for 780,000 patients, including over 280,000 Social Security numbers.
With more and more businesses engaging in third-party sharing of personal information, security and privacy issues are anticipated to grow.

Existing solutions are incapable of protecting data:
We have seen a tremendous increase in big data analytics tools, both open source and proprietary. However, we have not seen enough frameworks and tools to ensure security and privacy in this changing era, which is centered on data sharing. Existing, traditional defenses against data leakage are highly inadequate for the situation. We still see a lot of oral or written pledges to protect against data breaches (even the NSA relies on oral pledges), which are not effective if the motivation to leak data is stronger than the motivation to protect it. Passwords and authorized access remain at the top when it comes to data security and privacy; an effective password policy, coupled with strong password guidelines and an expiration policy, has remained one of the most useful tools for protecting against breaches. Even though passwords provide a good layer of security against unauthorized access, they are prone to hacking even with the strongest password-reset procedures in place, and we have witnessed their underlying vulnerabilities in recent years. Multifactor authentication provides a better approach than simple password authentication: the user needs a password as well as a second factor, such as a hardware token, a one-time code or a fingerprint.
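To make the multifactor idea concrete, here is a minimal sketch of a time-based one-time password in the style of RFC 6238, using only the Python standard library. This is an illustration of the mechanism, not a production implementation; real deployments should use a vetted library and secret management:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, timestep: int = 30, digits: int = 6, now: float = None) -> str:
    """Time-based one-time password (RFC 6238 style) using HMAC-SHA1.

    Server and device share `secret`; both derive the same short-lived
    code, so a stolen password alone is not enough to gain access.
    """
    if now is None:
        now = time.time()
    counter = int(now // timestep)                  # current 30-second window
    msg = struct.pack(">Q", counter)                # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Both sides compute the code independently from the shared secret.
shared_secret = b"12345678901234567890"
print(totp(shared_secret))  # a 6-digit code that changes every 30 seconds
```

The second factor here is possession of the device holding the secret; even a correct password is useless to an attacker without the current code.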
However, all these traditional approaches fail to address what happens in an intensive data sharing environment once the data has been delivered to a third party. Who owns the responsibility to protect the data in this highly distributed era of big data? New security policies and frameworks need to be put in place.

The need for new approaches to protect data:

As discussed above, traditional approaches and frameworks cannot guarantee a solid approach to data privacy and security. Leakage of confidential and sensitive information requires that data security and privacy be maintained at every level of the big data hierarchy. Given that data needs to be shared among entities, it becomes increasingly important to restrict data access through a virtualized environment. Data Leakage Prevention (DLP) technology provides one such solution: data packets are inspected by location and file classification. However, it can be too stringent for both end users and IT staff, and it does not protect against all accidental or intentional data leakage.
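The content-inspection side of DLP can be sketched in a few lines. The rules below are hypothetical and deliberately simplistic (real DLP products combine content analysis with file classification, fingerprinting and policy engines), but they show the basic idea of classifying outbound data before allowing a transfer:

```python
import re

# Hypothetical, minimal content-inspection rules in the spirit of DLP:
# label outbound text by the sensitive patterns it contains.
RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # US Social Security number
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def classify(text: str) -> set:
    """Return the set of sensitive-data labels found in an outbound message."""
    return {label for label, pattern in RULES.items() if pattern.search(text)}

def allow_transfer(text: str) -> bool:
    """Block the transfer if any sensitive pattern is present."""
    return not classify(text)

print(classify("Patient SSN 123-45-6789, contact jane@example.com"))
print(allow_transfer("Quarterly totals attached."))  # True: nothing sensitive found
```

Note that pattern matching alone cannot catch encrypted, obfuscated or paraphrased data, which is exactly the gap that virtualized workspaces such as the SMW below try to close.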
The SMW (Secure Medical Workspace), developed by RENCI and the University of North Carolina, provides an effective solution to data leakage. Originally designed to protect patient data, this framework can be generalized to other business problems.


SMW gives approved requesters access to the required data in a secure virtual workspace, coupled with the ability to prevent data sharing. SMW's technological features include:
  • Two-factor authentication for gaining access to SMW.
  • Virtualization technology to provide access to the required data.
  • Preconfigured virtual machine images to implement security policies.
  • Encryption for data in motion and data at rest.
  • IT management capabilities.

CONCLUSION:

With the increasing applications of data analytics, the privacy and security concerns around data are only going to grow. It is true that security frameworks and tools need to be revisited periodically to keep security policies up to date. However, in this data-centric era, it is not only important to update existing security frameworks but also to devise new methods of ensuring security and privacy. Recent data breaches make it evident that these frameworks need to improve; we can no longer wait for a breach to happen in order to identify the problems with a framework. The industry needs more research in the area of security and privacy to ensure that we do not lose our fundamental right to privacy in this modern era.
Traditional approaches limited to verbal or written agreements and two-factor authentication do not provide a solid framework for handling security issues related to third parties. Existing workplace policies need to be customized, or re-devised if required, to deal with the situation. We have 100 times more information in these huge data sets than was easily accessible a decade ago, and they contain both confidential and sensitive information for a significantly large number of people and entities. A single breach of one of these data sets leads to the loss of sensitive data for all of them. We have seen very useful applications of data in the big data world, continuously improving the way we live and the way businesses make decisions, but we cannot ignore the potential losses from unauthorized access to this data. Moreover, any practice that involves uninformed data collection and sharing needs to be tackled appropriately so that we do not lose our fundamental right to privacy. All the big market players employ data mining practices to derive insights from the data they collect.
Targeted marketing, one of the major areas of data analytics, is one example of how user activity is tracked and used by e-businesses without any notable user agreement or consent. We not only need to protect data from unauthorized access by hackers; we also need effective procedures to ensure that data is collected and mined in an ethical manner without sacrificing the privacy of the entities concerned.

We have seen security technologies evolve continuously as additional vulnerabilities are revealed by anticipated or past data breaches. However, with the huge amount of information held in these massive data sets, we have more at stake, and we need to be proactive to ensure the required level of security in the big data era.

References:
http://www.renci.org/wp-content/uploads/2014/02/0213WhitePaper-SMW.pdf

Saturday, February 20, 2016

Deep Learning and Big Data Analytics



In the last decade, data mining, or big data analytics, has become increasingly important for both public and private companies. We have seen a huge increase in new technologies, frameworks and hardware to support the wave of big data in recent years. With companies collecting more and more data, there is a huge need to mine this data to derive useful insights and decisions. Some of the most useful applications of data mining are in cyber security, marketing, fraud detection and customer relationship management. Data mining techniques are focused on learning generalized patterns from massive data sets and using these patterns to support future decision making.
However, the foundation of any data mining or machine learning technique is the quality of the data. A complex and efficient technique can fall short if the features used to train the model are not representative; on the other hand, even a simple algorithm can do a wonderful job if it is given the right features. So the question that pops up is: how do we find the right set of features in massive data sets with millions of data points? In the following discussion, we will look at some unique challenges involved in data mining at big data scale and see how deep learning can help overcome them.

KEY ISSUES:
  • Feature Engineering.
  • Information retrieval and indexing.
  • Dealing with Massive unlabeled/unsupervised data

ANALYSIS OF ISSUES:

Feature Engineering:
The performance and effectiveness of data mining and machine learning techniques, i.e. supervised or unsupervised algorithms, largely depend on the underlying data. No technique can lead to fruitful results if the underlying data is incorrect, non-representative or used improperly. Even though existing tools and technologies provide many options for discovering relationships between input variables and target variables, these options are meant to deal with a small number of features. The most frequently used data analysis techniques are limited to univariate, bivariate and multivariate analysis, which are confined to a small number of variables and focused on single-dimensional features. In the real world, however, features are generally not single-dimensional; input features can be multidimensional, and identifying and deriving them is essential to getting satisfactory results from the underlying algorithms. Feature engineering is the most time-consuming but important part of any data mining project. Many useful multidimensional features can be discovered with the help of business domain experts, but in massive data sets, which are highly unstructured and unorganized, many such features go undetected and the resulting models are suboptimal. Linear transformation techniques such as Principal Component Analysis are incapable of dealing with nonlinear features.
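A toy example makes the point about nonlinear, multidimensional features concrete. The data below is invented for illustration: neither raw coordinate separates the two classes on its own, but a hand-engineered feature combining both does:

```python
import math

# Two classes that no single raw coordinate separates: points near the
# origin (class 0) versus points on a surrounding ring (class 1).
inner = [(0.1, 0.2), (-0.3, 0.1), (0.2, -0.2)]
outer = [(1.0, 1.1), (-1.2, 0.9), (0.8, -1.3)]

def radius(point):
    """A hand-engineered nonlinear feature combining both raw inputs."""
    x, y = point
    return math.sqrt(x * x + y * y)

# In the engineered feature space a single threshold separates the classes,
# something a linear method on the raw (x, y) coordinates cannot do.
threshold = 1.0
predictions = [int(radius(p) > threshold) for p in inner + outer]
print(predictions)  # [0, 0, 0, 1, 1, 1]
```

Here the right feature had to be guessed by a human; the appeal of deep learning, discussed next, is that such nonlinear combinations can be learned from the data itself.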

Deep learning helps with this situation by providing techniques for the automatic extraction of the most representative features of a data set. Deep learning algorithms try to emulate the hierarchical learning of the human brain: they have the ability to generalize in non-local ways and to detect patterns beyond nearest neighbors. They provide richer generalization of input features through a multilayer abstraction approach, where at each layer the features are abstracted and generalized to capture multidimensionality. For example, an image is composed of different sources of variability in terms of light, the shapes of objects and their materials. The multilayer abstraction provided by deep learning algorithms can help separate these different sources of variation in the data.
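The idea of layered abstraction can be illustrated with the smallest possible network. The weights below are set by hand rather than learned, purely to show how stacking layers builds a nonlinear function (XOR) that no single linear layer can represent:

```python
def relu(v):
    """Standard rectified-linear activation."""
    return max(0.0, v)

def layer(inputs, weights, biases):
    """One layer: each unit abstracts the raw inputs into a new feature."""
    return [relu(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def xor_net(x1, x2):
    # Hidden layer: two intermediate features derived from the raw inputs.
    hidden = layer([x1, x2], [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0])
    # Output layer: a linear combination of the abstracted features.
    return hidden[0] - 2.0 * hidden[1]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))  # reproduces the XOR truth table
```

Real deep networks stack many such layers and learn the weights from data, but the mechanism of composing simple nonlinear features into richer ones is exactly this.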

Information retrieval and indexing:
Although not directly related to data mining, another major issue around the underlying data is the efficiency of information retrieval. Data in today's world has exceeded the typical storage, processing and computing capacity of traditional databases and data analysis tools. The rise of big data technologies has made it possible to store the massive amounts of data generated each hour. In addition to volume, big data is also associated with other complexities, namely variety, velocity and veracity.
With the growing dependence on data, efficient storage and retrieval of information has become increasingly important. Traditional indexing solutions no longer help, because this data is huge and not organized as a relational model. Data collected from sources such as video streams, images and audio requires more than traditional indexing: it needs semantic indexing, so that the data can be presented more efficiently and used as a source for knowledge discovery and comprehension. Deep learning provides a way to implement semantic indexing for efficient information retrieval. It generates high-level abstract data representations, which can be used for semantic indexing instead of indexing the raw data. Deep learning can not only provide semantic indexing but can also help uncover the complex relationships and factors that lead to knowledge and understanding. Abstract representations also make it possible to store similar items closer to each other in memory for fast retrieval.
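A minimal sketch of semantic indexing: documents are indexed by abstract vectors rather than raw text, and retrieval ranks them by similarity in that space. The vectors and document names below are invented for illustration; in a real system they would be produced by a trained deep network:

```python
import math

# Hypothetical learned representations: each document is indexed by an
# abstract vector (in practice the output of a deep network), not raw text.
index = {
    "cardiology report": [0.9, 0.1, 0.0],
    "heart surgery notes": [0.8, 0.2, 0.1],
    "quarterly earnings": [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def semantic_search(query_vec, index):
    """Return documents ranked by closeness in the representation space."""
    return sorted(index, key=lambda doc: cosine(query_vec, index[doc]), reverse=True)

# A query vector representing "heart disease" lands near both medical
# documents even though the raw strings share no keywords.
print(semantic_search([1.0, 0.0, 0.0], index))
```

This is also why similar items can be stored near each other: sorting or bucketing by representation puts semantically related documents together for fast retrieval.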

Dealing with Massive unlabeled/unsupervised data:
Another challenge, apart from the volume and velocity of data, is the ability to deal with massive amounts of unlabeled, unsupervised data. This data contains complicated nonlinear features. As explained earlier, deep learning can help with feature engineering through multilayer abstraction, but the task becomes even more daunting when the underlying data is unsupervised. It therefore becomes essential to decode these complex nonlinear features and use their simpler forms in the algorithms. Extracting such features so that classes can be told apart is known as a discriminative task.
Discriminative tasks not only help in discriminative analysis but can also be used for data tagging to improve search algorithms. For example, MAVIS, the Microsoft Research Audio Video Indexing System, uses deep learning to enable searching within speech. Discriminative tasks have become increasingly important with the growth of digital media collections, most of which come from social networks, GPS, medical imaging and image sharing systems. It is highly important to organize and store these images so that they can be browsed and retrieved efficiently. This huge collection of images is an example of unlabeled data, because in technical terms a picture is only a collection of pixels. We need efficient methods to store and organize this unsupervised, unlabeled data; text-based search is no longer capable of serving such a collection. One solution is automated tagging: extracting semantic information from the images themselves. Deep learning provides useful techniques for constructing representations of image and video data in real time, which can then be used for image indexing and retrieval.

CONCLUSION:

With growing business opportunities in the field of data, the dependence on data is only going to increase. We have witnessed exponential growth in this vertical in recent years; companies are trying to leverage every possible opportunity provided by data and are storing every bit and byte in raw format for later use. Even though data is king in today's market, the industry also needs to discover efficient ways of storing only the relevant data. This runs contrary to a fundamental premise of big data, but dumping massive data into data lakes can make retrieval difficult, with the most important pieces of data buried deep underneath. Some further challenges presented by big data are:
  • Inefficiency in dealing with high-dimensional data
  • Large-scale models
  • Problems with incremental learning

Models based on these huge data sets are not only computationally very expensive but also difficult to interpret. Moreover, model creation and evaluation is an iterative process, and it takes many iterations to discover the right set of parameters and the right algorithm for a given business problem. With huge data sets, this process becomes very time-consuming. Even though techniques such as Principal Component Analysis and deep learning can help discover the most relevant features of a given data set, the process is intensive and requires a deep understanding of both the business domain and statistical methods.
We have seen a huge number of business cases for leveraging the power of data in recent years, but the industry is still trying to come up with optimal solutions for mining data in real time. Much of the information provided by these huge data sets is time sensitive. For example, if a company is trying to predict stock prices, it is important to observe the trend line based on historical events, but it is even more important to be able to make predictions in real time.
With the increasing use of deep learning techniques in the data mining and artificial intelligence fields, we can expect solutions for incremental learning with real-time analytics. Deep learning can address many of these problems, but a lot of research work is still required to make it more broadly useful. Open source tools, technologies and frameworks such as R, Python, F# and scikit-learn have played a significant role in bringing the data industry to its current level. We have seen major companies such as Google, Facebook and Yahoo share their proprietary frameworks to help revolutionize the industry. The release of Google's TensorFlow library last year, to aid deep learning, should help improve the business use cases of deep learning for solving real-world problems such as image recognition and natural language processing with even higher accuracy. Nevertheless, it will be interesting to witness this journey, and even more interesting to be part of it.

References:
http://journalofbigdata.springeropen.com/articles/10.1186/s40537-014-0007-7