← Blog

Database for the Digital World— 1.3 (Review & Exercise)

One way you can work on your SQL skills without local data access is through theSQL Exercisewebsite. Once you register, you will be provided with the name of the tables and columns and the query you need to write. After you write your SQL you can check for accuracy by clicking on the "Run" button.

Here is a couple of snapshots of how SQL exercise work.

I find the above website to be a good tool to practice SQL without installing any database engine on your local machine.

The next section will have exercise on all the elements we learned in the previous articles using the database schema below. Answers will be provided at the end of this article to review your work.

Let's assume the above database schema belongs to a company you are consulting with and you need to answer the following questions. (I would suggest that you create the table and populate them as a part of the exercise. However, if you want the script for the tables in the above diagram, let me know. I will upload it in Github.)

1. The total number of departments, the total number of employees and the total number of managers.

2. List the employees in the "Marketing" Department.

3. Total number of managers with title "Vice President" and name of the "Vice President" who has held that title the longest

4. List the Employees who have been in more than one department.

5. Find the newest employee and the employee who has been with the company the longest.

6. Find the % of Employees who are managers

7. List the employees and their titles who joined the company in the first year of operation.

8. List the current lowest and the highest salary.

9. Total number of Male and Female Employees

10. The total number of employees hired each year in the past three years.

With this, I will wrap up the first chapter of "Database for Digital Marketers". In the next chapter, I will cover the basics of data analysis and various methods of understanding, analyzing and interpreting marketing data.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Answers:

— 1. The total number of departments, the total number of employees and the total number of managers.

Select count(*) from departments

Select count(*) from dept_manager

Select count(*) from employees

— 2. List the employees in the "Marketing" Department.

Select e.First_name,e.Last_name

from employees e

Join dept_emp b

On e.emp_no=b.emp_no

Join departments d

On b.dept_no=d.dept_no

Where d.dept_name='Marketing'

— 3. Total number of managers with the title "Vice president" and name of the "Vice president" who has held that title the longest

Select count(*) from

Titles where title = 'Vice President'

Select top 1 e.First_name, e.last_name, datediff(DD,from_date,to_date)/365 as number_years_as_vp

From

employees e

Join Titles t

on e.emp_no=t.emp_no

where title = 'Vice President'

Order by datediff(DD,from_date,to_date) desc

— 4. List the employees who have been in more than one department.

Select Emp_no, count(dept_no)

From dept_emp

Group by Emp_no

Having count(dept_no) > 1

— 5. Find the newest employees.

Select first_name,last_name

From employees where hire_date = (Select min(hire_date) from employees)

— 6. Find the % of Employees who are managers

Select Count(Manager)*100/Count(AllEmployee)

From

(Select

Case

when

m.emp_no is not null then m.emp_no else null end as Manager,

e.emp_no as AllEmployee

From employees e

Left join dept_manager m

On e.emp_no= m.emp_no) a

— 7. List the employees and their titles who joined the company in the first year of operation.

Select first_name,last_name,titles.title

From employees

join titles

on employees.emp_no=titles.emp_no

where hire_date <= (Select dateadd(dd,365,min(hire_date)) from employees)

— 8. List the current lowest and the highest salary.

Select min(salary), max(salary)

From salaries

Where to_date> GETDATE()

— 9. Total number of Male and Female Employees

Select count(case when gender='M' Then emp_no else null end) as MaleemployeesCount,

count(case when gender='M' Then emp_no else null end) as FemaleemployeesCount

From employees

— 10. The total number of employees hired each year.

Select Year(hire_date) as hire_year,count(*) from employees

Group by Year(hire_date)

Conclusion:

Now that you have the SQL basics under your belt, we will learn to drive insights from historical data and predict performance which in turn will help us make decisions. The industry term for this process is Data-Driven analysis. Here is the process briefly discussed in 3 steps:

Analyze: Analyze historical data from the organization as well as market and provide insights.

Predict: Forecast or predict performance based on historical data.

Decide: Make decision i.e. take action to invest in relevant marketing programs and campaigns.

Data-driven marketing is an iterative process. i.e. once you reach the third step of "Decide" or "Take Actions", you go back to step 1 and repeat these steps again to address changes(market as well as organization) and optimize. One can write a book in each of these steps but today I will dedicate a couple of paragraphs to each.

Analyze:

The very first step of data analysis is data exploration. In most companies, your data engineer will have the data in the reporting database for you to analyze. In some scenarios though, especially in startups, when you start a new marketing campaign you may have to work with your data engineer to ensure proper tracking of the campaign level insights(acquisition, revenue, retention).

Once we have the data ready, we summarize or aggregate data to answer some questions that are important for our business. We want to know how much revenue has been generated on a monthly or quarterly basis. We will also want to know how much does a customer spends on average and so on.

When a company is launching a new product, they will be interested in knowing the revenue share of the new product with respect to total revenue or how this product revenue compares to the other products by the same company. It is also very important to compare the performance of the product with respect to the industry and its major players. For example, if the product market is growing at 30% and the newly launched product is growing at 15%, they may want to revisit their marketing campaigns given the new product itself is competitive. These questions will vary depending on the business objectives. And the quantitative measure of these key business objectives is known as a Key Performance Indicator.

Predict:

Forecasting or predicting is the process of estimating future performance based on historical data. As we already know, it is impossible to predict a future event with 100% certainty; be it weather, election, company revenue or customer retention. So, what we do is make an estimate, calculate the likelihood or probability of an event outcome. There are various statistical algorithms already available to address different scenarios and one needs to understand these and use them accordingly. Many of these statistical algorithms are highly complex and can read data minutely that is impossible for a human eye to catch. Based on these readings an algorithm discovers a pattern and uses that pattern to predict future events. This process of reading, discovering and estimating data is popularly known as Machine Learning.

Machine Learning is at the core of intelligent data analysis also known as Data Science, a term very recently popularized, thanks to the data revolution. It is a field of study that combines various traditional disciplines (Mathematics, Statistics, Computer Science, Business) along with Industry-specific knowledge to extract insights from data.

Decide:

The sole purpose of the previous steps (Analyze and Predict) is to guide us to make decisions and drive future investments and campaigns in a more analytical way rather than solely depending on our intuition. For digital marketers, the decision is to build a marketing strategy to channel marketing investments towards different marketing vehicles in rewarding markets and products. Popular digital marketing vehicles include advertising, email campaigns, promotions, incentives, website, social media as well as the online customer support community.

In the coming weeks, I will discuss each of the above 3 steps for specific marketing operations and how it can benefit from data-driven marketing. I will start with customer segmentation.

Customer segmentation is an act of grouping customers into groups of individuals that are similar in multiple ways relevant to marketing. One group, for example, can be women in mid-thirties who spend more than $1000 a couple of times. We use an unsupervised machine learning model to form clusters. This technique is known as clustering. Once we find a way to name these clusters and find what works for them, we can market to each of these groups separately. As we acquire new customers, we can use supervised classification models to assign a new customer to these groups.