Saturday, 29 November 2014

How to Become a Data Scientist

Over the past year, interest in data science has soared. Nate Silver is a household name, companies everywhere are searching for unicorns, and professionals in many different disciplines have begun eyeing the well-salaried profession as a possible career move.
In our recruiting searches here at Burtch Works, we’ve spoken to many analytics professionals who are considering adapting their skills to the growing field of data science, and have questions about how to do so. From my perspective as a recruiter, I wanted to put together a list of technical and non-technical skills that are critical to success in data science, and at the top of hiring managers’ lists.
Every company will value skills and tools a bit differently, and this is by no means an exhaustive list, but if you have experience in these areas you will be making a strong case for yourself as a data science candidate.
Technical Skills: Analytics
1. Education – Data scientists are highly educated – 88% have at least a Master’s degree and 46% have PhDs – and while there are notable exceptions, a very strong educational background is usually required to develop the depth of knowledge necessary to be a data scientist. Their most common fields of study are Mathematics and Statistics (32%), followed by Computer Science (19%) and Engineering (16%).
2. SAS and/or R – In-depth knowledge of at least one of these analytical tools is expected; for data science, R is generally preferred.
Technical Skills: Computer Science
3. Python Coding – Python is the most common coding language I typically see required in data science roles, along with Java, Perl, or C/C++.
4. Hadoop Platform – Although this isn’t always a requirement, it is heavily preferred in many cases. Having experience with Hive or Pig is also a strong selling point. Familiarity with cloud tools such as Amazon S3 can also be beneficial.
5. SQL Database/Coding – Even though NoSQL and Hadoop have become a large component of data science, it is still expected that a candidate will be able to write and execute complex queries in SQL.
6. Unstructured data – It is critical that a data scientist be able to work with unstructured data, whether it is from social media, video feeds or audio.
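To give a rough sense of what "complex queries" in item 5 can mean in practice, here is a minimal sketch using Python's built-in sqlite3 module. The tables, column names and figures are all invented for illustration; the point is only the shape of the query: a join, an aggregation, a filter on the aggregate, and an ordering.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'North'), (3, 'South'), (4, 'East');
    INSERT INTO orders VALUES
        (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0), (4, 3, 300.0), (5, 4, 40.0);
""")

# Revenue per region, keeping only regions above a threshold,
# with the highest-revenue region first.
QUERY = """
    SELECT c.region,
           COUNT(DISTINCT c.id) AS customers,
           SUM(o.amount) AS revenue
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.region
    HAVING SUM(o.amount) > 100
    ORDER BY revenue DESC
"""

rows = conn.execute(QUERY).fetchall()
for region, customers, revenue in rows:
    print(region, customers, revenue)
```

Real interview questions and production queries will of course vary by company and database; this is only one plausible example of the join-and-aggregate pattern.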
Non-Technical Skills
7. Intellectual curiosity – No doubt you’ve seen this phrase everywhere lately, especially as it relates to data scientists. Frank Lo describes what it means, and talks about other necessary “soft skills” in his guest blog posted a few months ago.
8. Business acumen – To be a data scientist you’ll need a solid understanding of the industry you’re working in, and know what business problems your company is trying to solve. In terms of data science, being able to discern which problems are important to solve for the business is critical, in addition to identifying new ways the business should be leveraging its data.
9. Communication skills – Companies searching for a strong data scientist are looking for someone who can clearly and fluently translate their technical findings to a non-technical team, such as the Marketing or Sales departments. A data scientist must enable the business to make decisions by arming them with quantified insights, in addition to understanding the needs of their non-technical colleagues in order to wrangle the data appropriately. Check out our recent flash survey for more information on communication skills for quantitative professionals.
The next question I always get is, “What can I do to develop these skills?” There are many resources around the web, but I don’t want to give anyone the mistaken impression that the path to data science is as simple as taking a few MOOCs. Unless you already have a strong quantitative background, the road to becoming a data scientist will be challenging – but not impossible.
However, if it’s something you’re sincerely interested in, and you have a passion for data and lifelong learning, don’t let your background discourage you from pursuing data science as a career. Here are a few of the resources we’ve found to be helpful:
Resources
  1. Advanced Degree – More Data Science programs are popping up to serve the current demand, but there are also many Mathematics, Statistics, and Computer Science programs.
  2. MOOCs – Coursera, Udacity, and Codecademy are good places to start.
  3. Certifications – KDnuggets has compiled an extensive list.
  4. Bootcamps – For more information about how this approach compares to degree programs or MOOCs, 
  5. Kaggle – Kaggle hosts data science competitions where you can practice, hone your skills with messy, real-world data, and tackle actual business problems. Employers take Kaggle rankings seriously, as they can be seen as relevant, hands-on project work.
  6. LinkedIn Groups – Join relevant groups to interact with other members of the data science community.
  7. Data Science Central and KDnuggets – Data Science Central and KDnuggets are good resources for staying at the forefront of industry trends in data science.
  8. The Burtch Works Study: Salaries of Data Scientists – If you’re looking for more information about the salaries and demographics of current data scientists be sure to download our data scientist salary study.

Tuesday, 25 November 2014

Smart Cities in India


A Smart City offers economic activities and employment opportunities to a wide section of its residents, regardless of their education, skills or income level.


By 2050, the world will witness a mass migration of people into cities. Two out of every three people will be living in urban areas, which translates into 6.3 billion urban dwellers. By 2050, Asia and Africa will account for 86% of the world’s urban population. India alone will add more than 400 million people to its cities – that is twice the population of Brazil today. The consequence of this migration, especially in a fast-growing economy like India, is a significant increase in the demand for and consumption of resources. The coal reserves in India’s existing coal mines are likely to be exhausted in little more than 50 years at the current rate of consumption. India’s oil imports account for the biggest share of the Current Account Deficit. Cities contribute 70% of India’s GDP. The growth of cities is therefore inevitable.

However, unplanned growth and distribution of resources could prove catastrophic to the economy and impede progress. Building sustainable, smart cities from scratch and retrofitting sustainability features into existing cities are the only way out.
Smart cities are the only conceivable solution to urbanization of this scale.

A Smart city includes a structure that is resource efficient and has a minimal impact on the environment. Among other things, a smart city reduces the energy and water requirement by employing technology and smart construction & design techniques, reduces the generation of solid waste and uses renewable sources to meet energy requirements.

Additionally, to promote a more convenient way of life, a smart city incorporates a sophisticated Information and Communication Infrastructure. A robust transport network to move people is also established.
While retrofitting smart city-like features into an already existing city is a possible solution, it comes with its limitations. Retrofitting is more expensive and inconvenient primarily because of high replacement costs and limitations of the existing structures. The market for retrofitting is still in its nascent stages and therefore not fully understood. While areas like lighting, air conditioning, etc. have seen some technological innovation, other areas like disaster resistance are untouched. Permits and legal requirements are additional challenges.

Although building smart cities from the ground up is a more feasible solution, it is a daunting task. The smart city concept has been tried several times in the past in different parts of the world, and some attempts have failed. Many of these unsuccessful attempts were characterized by overambitious city sizes, bold investments, technological misfits, poor urban planning, etc.


The Indian smart city landscape must be engineered to encourage the development of smaller and more realizable cities, with technologies adapted to the Indian ecosystem and innovative financing. Building a strong support infrastructure, providing recreation options and encouraging businesses to flourish will make the city more habitable and desirable. Formation of communities must be allowed to follow a natural path.
Building smart cities is investment heavy and time consuming. However, India has to embrace sustainable methods soon to avoid eventual chaotic circumstances. The success of smart cities cannot be attributed to technologies alone. People must be educated about the importance and necessity of sustainable practices and the advantages of investing in sustainable settlements.

Thursday, 20 November 2014

SMART CITY, GIS and FIVE PILLARS

What is a smart city

People migrate to cities primarily in search of employment and economic activities, besides a better quality of life. Therefore, to be sustainable, a Smart City needs to offer economic activities and employment opportunities to a wide section of its residents, regardless of their education, skills or income level. In doing so, a Smart City needs to identify its comparative or unique advantage and core competence in specific areas of economic activity and promote such activities aggressively, by developing the required institutional, physical, social and economic infrastructure for them and attracting investors and professionals to take up such activities. It also needs to support the required skill development for such activities in a big way. This would help a Smart City develop the environment required for the creation of economic activities and employment opportunities.

Smart City, GIS and Five Pillars
The report identifies five “pillars”, namely Power, Water, Transport, Solid Waste Management and Safeguarding (Public Safety), and identifies enablers to better utilize Information and Communications Technologies (ICT). Governance, Planning, Infrastructure & networks, Data analytics, Geographic Information Systems (GIS) and Cyber Security have been identified as enablers. In reality GIS is similar to any other IT enterprise component, but for some reason GIS has been identified as a separate enabler. Ideally a “secure” GIS-based IT enterprise should have been considered, one which offers capability for “analytics”, can be used for “planning” and thus supports effective and efficient “governance”.
Geographic Information Systems
The report refers to GIS as a “…system that involves superimposition of several layers of geo-data and information systems in a specific sequence to create a comprehensive geospatial / geographic information system”. Technically this statement still holds good, but gone are the days when GIS was used only for viewing thematic maps and a little bit of spatial analysis. Today GIS offers much more than that. A GIS can be integrated with non-spatial data, multiple databases, multiple systems, real-time sensors and devices and so on, and can be made available on cloud, web, mobile or desktop environments. I would prefer calling it “A system / solution that can capture, store, manipulate, analyze, manage, and present all types of data in a geographical context”.
Some observations specific to the way forward:
Power – The Smart Grid section makes no reference to GIS. GIS is a critical component of a smart grid, facilitating effective and efficient network management, asset management, consumer information management, workforce management and outage management. Integration of call centres, billing, payments and SCADA sensors with GIS in a real-time scenario can offer actionable intelligence for plugging pilferage, managing and restoring outages, and so on.
Solid Waste – The long-term proposal recommends applying GIS and GPS solutions to enable route optimization and process improvement. With GIS deployment and GPS enablement proposed as a medium-term plan, route optimization and process improvement could ideally be accomplished at that stage itself. Most GIS software packages come with route planning and optimization tools nowadays.
Water – While it addresses GIS integration, the report misses out on pipeline distribution management, asset management, water quality monitoring and management etc., which are key components of water systems. With the percentage of non-revenue water (NRW) high in the Indian context, GIS can offer actionable intelligence to bring down the NRW.
Traffic – Smart traffic management could be accomplished on a GIS-based platform. Smart surveillance can be integrated with such a system and graduated to an Intelligent Traffic Management System. In addition, such a system can also be used for planning, monitoring and maintenance of transport networks, asset management etc., which are critical components of traffic management.
Safeguarding (Public Safety) – City Surveillance, Command and Control and CAD (computer-aided dispatch) are addressed as separate entities. Ideally this should be an integrated system. The report describes “CAD vehicles” and “GIS & GPS enabled vehicles”; in reality CAD is a software solution / system. On command and control, a GIS-based CAD system can be scaled up by integrating video feeds and multiple sensor data to offer enhanced awareness of the incident location. This can further be graduated to City Surveillance systems.
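To make the non-revenue water point under Water a little more concrete, here is a minimal sketch of the arithmetic that a GIS layer of supply zones could feed. It assumes the common definition of NRW as the volume supplied minus the volume billed, expressed as a percentage of the volume supplied. The zone names and volumes are hypothetical, and the function name `nrw_percent` is my own; in a real deployment the figures would come from metering and billing systems joined to zone boundaries in the GIS.

```python
def nrw_percent(input_volume, billed_volume):
    """Non-revenue water as a percentage of the volume supplied."""
    if input_volume <= 0:
        raise ValueError("input_volume must be positive")
    return 100.0 * (input_volume - billed_volume) / input_volume


# Hypothetical supply zones: (volume supplied, volume billed), in megalitres.
zones = {
    "Zone A": (1000.0, 800.0),
    "Zone B": (500.0, 250.0),
    "Zone C": (800.0, 600.0),
}

# Rank zones from worst to best so that field teams can prioritise
# leak detection and metering checks where losses are highest.
ranked = sorted(
    ((name, nrw_percent(supplied, billed)) for name, (supplied, billed) in zones.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, pct in ranked:
    print(f"{name}: {pct:.1f}% NRW")
```

Ranking by zone is only one way to slice the data; the value of doing it in a GIS is that the worst zones can be viewed alongside pipeline age, pressure readings and complaint locations on the same map.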

Saturday, 15 November 2014

Data Scientist in 8 easy steps
