ScienceOps or SciOps Tasks

Helping other businesses grow through technology is a real need, both for start-ups and for growing businesses that want business continuity.  Technology now lets services be created and delivered faster, thanks to offerings such as cloud services.  Platforms such as Amazon Web Services and OpenStack, among many others, provide infrastructure as a service to help technology businesses deliver high-quality applications.

Users of these apps are now mobile: they are on smartphones, laptops, tablets, and many other gadgets ready to run apps.  Apple devices are everywhere too, and all of those user apps work best when paired with data on a server, the cloud.  The cloud serves as the authentication service, the knowledge base, and even the data storage for this infrastructure.  Regardless of the application, a good system and network infrastructure is worth having.

These infrastructures are built on on-premise computers and servers, in the cloud, or both.  Cloud services reduce costs in that you do not have to spend time waiting for servers to arrive, you can scale up and down, and you can even scale automatically.  But what is automatic scaling?  A properly managed infrastructure can be designed to add instances or servers when demand for resources is high and to shrink in number when only a few instances are needed.  All of this is beneficial.
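
The scaling decision described above can be sketched in a few lines of Python.  This is only an illustration, not how any particular cloud provider implements it: the function name, the 60% CPU target, and the minimum and maximum instance counts are all assumptions made for the example.

```python
import math


def desired_instances(current, cpu_percent, target_percent=60.0,
                      min_instances=2, max_instances=10):
    """Return how many instances to run so average CPU moves toward the target.

    All thresholds here are hypothetical defaults chosen for illustration.
    """
    # Resize the fleet in proportion to how far utilization is from the target.
    raw = math.ceil(current * cpu_percent / target_percent)
    # Clamp to the allowed range so the fleet never drops below the minimum
    # or grows beyond the maximum.
    return max(min_instances, min(max_instances, raw))


if __name__ == "__main__":
    print(desired_instances(4, 90.0))  # busy period: the fleet grows
    print(desired_instances(4, 15.0))  # quiet period: the fleet shrinks
```

The clamp is the important design choice: it keeps a traffic spike from creating an unbounded bill and keeps a quiet night from shutting the service down entirely.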

With all of these things comes infrastructure security.  Securing and maintaining your cloud infrastructure, and keeping it up to date with the latest software, fixes, and patches, are all necessary.  However, we cannot be sure what we are doing right or what we are missing if we do not audit.

Designing a well-secured infrastructure takes more than that.  We have to monitor things in order to know about downtime, and we also have to measure.  To measure infrastructure performance, we create metrics that check the status or value of every parameter on the server.  We then have to collect these measurements and build up our capability to analyze them.
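
As a minimal sketch of what collecting and analyzing such metrics might look like, the Python below summarizes a list of CPU samples and computes uptime from a list of health-check results.  The function names, the 75% alert threshold, and the sample values are assumptions for illustration; a real setup would pull these numbers from a monitoring system.

```python
from statistics import mean


def summarize_cpu(samples, alert_threshold=75.0):
    """Summarize CPU utilization samples (percentages) for one server."""
    return {
        "mean": mean(samples),
        "max": max(samples),
        # Count how many samples exceeded the (hypothetical) alert threshold.
        "breaches": sum(1 for s in samples if s > alert_threshold),
    }


def uptime_percent(health_checks):
    """Return the share of successful health checks as a percentage."""
    # True counts as 1 and False as 0 when summed.
    return 100.0 * sum(health_checks) / len(health_checks)


if __name__ == "__main__":
    print(summarize_cpu([20, 40, 60, 80, 100]))
    print(uptime_percent([True, True, False, True]))
```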

The data scientists for this work are ScienceOps.  They mostly work on building good infrastructure, but that is not all.  Even after seeing what is good in your infrastructure, you will need to know what is in it for your business.  ScienceOps may need some background in business, marketing, and strategic planning, as well as in the apps you are targeting for measurement.  SciOps will then need to analyze your user data in addition to the server logs.  With the user data, they can show you how you are faring in the market.  They can show you whether you are growing over time or just holding stagnant users who no longer interact with your app.
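
One simple way to see whether users are growing or going stagnant is to count distinct active users per month.  The sketch below assumes a hypothetical event log of (user id, ISO date string) pairs; the field layout and function name are illustrative, not taken from any specific analytics product.

```python
from collections import defaultdict


def monthly_active_users(events):
    """Count distinct users per month from (user_id, 'YYYY-MM-DD') events."""
    users_by_month = defaultdict(set)
    for user_id, day in events:
        # The first seven characters of an ISO date are the 'YYYY-MM' month key.
        users_by_month[day[:7]].add(user_id)
    return {month: len(users) for month, users in sorted(users_by_month.items())}


if __name__ == "__main__":
    log = [
        ("a", "2024-01-05"),
        ("b", "2024-01-09"),
        ("a", "2024-01-20"),
        ("a", "2024-02-01"),
    ]
    print(monthly_active_users(log))
```

A count that falls from month to month, as in this toy log, is the kind of signal that users are no longer interacting with the app.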

Your marketing success can be measured, and so can much more about your product and service choices and the behavior of your customers or app users.  With proper metrics, sound analysis, and the insight gained, the management and product teams can design, update, and steer your product or service in a direction that satisfies your customers and service users.

5 D’s of Data Science

Here are the five Ds of data science:

  1. Data
  2. Digitization
  3. Description
  4. Depiction
  5. Discovery



In data science, what is needed most is data: the observations or examples.  With data, we can describe how much, how strong, or what value or measurement applies to a situation or a thing.  Data comes into existence when we describe an event or measure something.  It is the most important building block for data science tasks.  With data, we are able to show quantity and quality, and it becomes the basis of our equations and statistics.  We observe, or sometimes use instruments or probes, in order to gather data for our analysis or research.



We cannot process raw data unless it is digitized, that is, put into a computer system or encoded in a form that can be processed.  The format is not limited to text, graphics, spreadsheets, vectors, audio, or video; we can use any digital format we like.  Through digitization, we can speed up the analysis and the procedures applied to compute statistical measures.  We can then draw inferences from the findings and create more insight.  Digitization also makes sharing information easier, as the data can be stored and retrieved for future use.



Through the tools that we have, namely mathematical equations and statistics, we can describe the data.  We can determine whether assumptions are right or wrong through the hypotheses that we formulate.  We can then make deductions from what we have gathered.  Those deductions help us understand more and guide our next steps with the data, whether the goal is to solve a problem, understand a situation, or teach machines.  These machines, in turn, will be put to practical use to aid human ability in different aspects of our lives, including but not limited to traffic, medicine, marketing, economics, planning, production, operations, and understanding behavior.
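
To make description and hypothesis checking concrete, here is a minimal Python sketch using only the standard library.  It computes basic descriptive statistics and a one-sample t statistic, which measures how far the sample mean is from a hypothesized mean in units of standard error.  The function names and sample values are made up for illustration, and the one-sample t statistic is just one of many possible tests.

```python
import math
from statistics import mean, median, stdev


def describe(data):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(data),
        "mean": mean(data),
        "median": median(data),
        "stdev": stdev(data),  # sample standard deviation
    }


def t_statistic(data, hypothesized_mean):
    """One-sample t statistic: (sample mean - hypothesized mean) / standard error."""
    standard_error = stdev(data) / math.sqrt(len(data))
    return (mean(data) - hypothesized_mean) / standard_error


if __name__ == "__main__":
    observations = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
    print(describe(observations))
    print(t_statistic(observations, 4))
```

A t statistic near zero means the data are consistent with the hypothesized mean; judging how large is "too large" requires comparing it against a t distribution with n - 1 degrees of freedom.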



In data science, when we do machine learning, we mine information and create training and testing sets; we can then depict or predict the future.  With visualization, we can also explain what we have found through insights.  We can share the information for consumption by a wide range of audiences, from academia to the professions, medicine, science, and the like.  With depiction, or visualization, we can help different people understand what we have found.  This is where data science becomes an art, a place for creativity aimed at mass consumption.
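
Creating training and testing sets can be as simple as shuffling the examples and cutting the list in two.  The sketch below is one minimal way to do it with the standard library; the function name, the 25% default test fraction, and the fixed seed are assumptions for the example.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and split them into (train, test) lists."""
    shuffled = list(examples)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = train_test_split(range(10), test_fraction=0.2, seed=42)
    print(len(train), len(test))
```

The model is fitted on the training list only, and the testing list is held back so that predictions are checked against examples the model has never seen.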



At the end of most of a data scientist's research, a discovery emerges from the different insights, or clarity comes through the process as the prize for hard work.  The discoveries from the tasks conducted can help predict reality, give warnings, and inform people.  The stakeholders are mostly pharmaceutical companies, doctors of medicine (through biostatistics and analysis), and businesses or enterprises.  The information uncovered can be a great help in making future decisions on improving medicine, processes, products, or strategies, such as those used in marketing campaigns, in designing educational materials, and in providing new products and services for the benefit of people.

Machine Learning, A Look in the Past

Before Big Data became popular, back in the days of Web 1.0, there was the machine learning of the past, which utilized market basket analysis.  It was very dominant in advanced e-commerce stores and online shops.  Job sites also used this technology.  How did they implement it?  Cookies: not those in your kitchen jar, but the text files that remember your preferences, the sites you have visited, and the things you have clicked on the internet.
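
Market basket analysis usually comes down to three measures for a rule such as "people who buy bread also buy milk": support (how often the items appear together), confidence (how often the rule holds when the first item is present), and lift (confidence relative to how common the second item is anyway).  The Python sketch below computes them over a tiny, made-up list of baskets; the function names and data are illustrative only.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= set(basket)) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimated P(consequent in basket | antecedent in basket)."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


def lift(baskets, antecedent, consequent):
    """Confidence divided by the consequent's own support."""
    return confidence(baskets, antecedent, consequent) / support(baskets, consequent)


if __name__ == "__main__":
    # Hypothetical shopping baskets, invented for this example.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
    print(lift(baskets, {"bread"}, {"milk"}))
```

A lift above 1 suggests the items are bought together more often than chance would predict; a lift below 1, as in this toy data, suggests the opposite.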

And what was that? Machine learning, part of the work of today's so-called data scientists. Facebook analyzes all of our likes, shares, and streams today, and Twitter can do the same; I have even tried sentiment analysis of tweets using Python. There is Google with its intelligent algorithms, and Yahoo, an early adopter of Hadoop and its HDFS storage layer (a Big Data system). Many SQL-based database management systems are also in widespread use. In those days MATLAB was widely used, along with SPSS, SAS, and S-PLUS; now there is R. Nowadays there is also Pig, which simplifies writing MapReduce jobs, the programming model used by Hadoop.

But who benefited from data science in the past? Amazon, which started as an online bookstore, has used data science, data mining, and data analysis to show you the most relevant products you can buy. It is now a general online store and has even become a cloud service provider. Its algorithms help upsell and show you items related to what you have already bought.

Among the most successful in utilizing Big Data and data science is Walmart. They know how much to display in store, how much inventory to carry, and even when you will buy your next coffee beans, sugar, or the infant milk and cereals that you buy on your scheduled shopping trips. This is sales forecasting, and Walmart grew because of this so-called business intelligence. It is data science: they use algorithms, mathematical equations, and operations research tools to manage and understand consumer behavior.
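
As a very small illustration of sales forecasting, the sketch below predicts the next period's sales as the average of the most recent periods.  This is a generic moving-average forecast chosen for simplicity, not a description of Walmart's actual methods; the function name, window size, and sales figures are assumptions.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or len(history) < window:
        raise ValueError("need at least `window` observations and window >= 1")
    recent = history[-window:]
    return sum(recent) / window


if __name__ == "__main__":
    weekly_units_sold = [100, 120, 140, 160]  # hypothetical figures
    print(moving_average_forecast(weekly_units_sold))
```

A moving average lags behind a steady trend, so real forecasting systems typically add trend and seasonality terms; the point here is only to show the basic idea of predicting demand from past sales.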

So what we now recognize as data science is a thing of the past.  Today, however, a successful e-scientist must be multidisciplinary, with skills in diverse fields such as business and marketing, economics, mathematics, statistics, operations research, some IT, big data, and creativity.  Yes, creativity: without it there is no spark of wisdom.  It is mostly intuition, insight, and looking at the world and the data from a different angle in order to predict, to deduce, and to induce.