Here are 10 more examples that illustrate some of the differences between Excel and SQL.
✦1✦
Excel formula to extract the first word from a cell
=LEFT(A1,FIND(" ",A1)-1)
SQL query to select the first word from a column in a table
SELECT SUBSTRING_INDEX(column_name, ' ', 1) FROM table_name;
✦2✦
Excel formula to calculate the average of the top 3 values in a range of cells
=AVERAGE(LARGE(A1:A10,{1,2,3}))
SQL query to calculate the average of the top 3 values in a column in a table
SELECT AVG(column_name) FROM
(SELECT column_name FROM table_name
ORDER BY column_name DESC LIMIT 3) subquery;
✦3✦
Excel formula to count the number of occurrences of a specific word in a range of cells
=COUNTIF(A1:A10,"word")
SQL query to count the number of occurrences of a specific word in a column in a table
SELECT COUNT(*) FROM table_name WHERE column_name LIKE '%word%';
✦4✦
Excel formula to calculate the standard deviation of a range of cells
=STDEV.S(A1:A10)
SQL query to calculate the standard deviation of a column in a table
SELECT STDDEV_SAMP(column_name) FROM table_name;
✦5✦
Excel formula to generate a random number between 1 and 10
=RANDBETWEEN(1,10)
SQL query to select a random row from a table
SELECT * FROM table_name ORDER BY RAND() LIMIT 1;
✦6✦
Excel formula to count the number of cells in a range that contain a specific word
=COUNTIF(A1:A10,"*" & "word" & "*")
SQL query to count the number of rows in a table that has a specific word in a column
SELECT COUNT(*) FROM table_name WHERE column_name LIKE '%word%';
✦7✦
Excel formula to calculate the sum of the squares of the values in a range of cells
=SUMPRODUCT(A1:A10,A1:A10)
SQL query to calculate the sum of the squares of the values in a column in a table
SELECT SUM(column_name * column_name) FROM table_name;
✦8✦
Excel formula to generate a random number from a normal distribution with a mean of 0 and a standard deviation of 1
=NORM.INV(RAND(),0,1)
SQL query to select a random row from a table with a probability proportional to the value in a specific column
SELECT * FROM table_name ORDER BY RAND() * column_name LIMIT 1;
✦9✦
One common task in data engineering is data cleaning, which involves identifying and fixing errors or inconsistencies in a dataset. This can be done in Excel using a combination of formulas and functions to identify and correct errors in the data. For example, the following Excel formula can be used to replace empty cells with a default value:
=IF(ISBLANK(A1), 0, A1)
In SQL, data-cleaning tasks can be performed using a combination of conditional logic and string manipulation functions. For example, the following SQL query can be used to replace empty values in a column with a default value:
UPDATE table_name SET column_name = 0 WHERE column_name IS NULL;
✦10✦
Another common task in data engineering is data transformation, which involves converting data from one format or structure to another. This can be done in Excel using a combination of formulas and functions to manipulate the data in a worksheet. For example, the following Excel formula can be used to convert a date in the format “YYYY-MM-DD” to the format “MM/DD/YYYY”:
=DATEVALUE(A1)
In SQL, data transformation tasks can be performed using a combination of string manipulation and date/time functions. For example, the following SQL query can be used to convert a date in the format “YYYY-MM-DD” to the format “MM/DD/YYYY”:
SELECT DATE_FORMAT(column_name, '%m/%d/%Y') FROM table_name;