Calculating Sales Insights: SUM Function and GROUP BY with Multi-Level
Date Analysis:
In this example, we will demonstrate how to use the
SUM function and the GROUP BY clause in SQL
Server. The SUM function is used to calculate the total sum
of a numeric column, while the GROUP BY clause is used to
group rows that have the same values in specified columns into summary
rows.
The OUTPUT clause in T-SQL is a powerful feature that
allows you to retrieve information about rows affected by
INSERT, UPDATE, DELETE, or
MERGE statements. This information can be used for purposes
like confirmation messages, auditing, or archiving. You can also insert
the results into a table or table variable for further processing.
The OUTPUT clause can be used with the following SQL statements:
INSERT
UPDATE
DELETE
MERGE
When performing an INSERT operation, the OUTPUT clause can capture
values from the newly inserted rows, such as identity column values or
computed columns. Similarly, during a DELETE operation, it can retrieve
details about the deleted rows. The syntax for the OUTPUT clause is as
follows:
Analytic functions process a group of rows to compute aggregate values
while preserving detailed insights. Unlike aggregate functions, which
reduce groups to a single row, analytic functions can output multiple rows
for each group. They are ideal for calculating metrics like moving
averages, running totals, percentage distributions, or identifying top-N
results within a group.
You may refer the link
https://learn.microsoft.com/en-us/sql/t-sql/functions/analytic-functions-transact-sql?view=sql-server-ver16.
Different types of analytic functions in SQL Server:
CUME_DIST:
Stands for cumulative distribution
Calculates the percentage of rows with values less than or equal
to the current row's value.
Formula: text{CUME_DIST} = {Rank}/{Total Rows}
Values range from 0 to 1, inclusive.
PERCENT_RANK:
Represents the percentage of rows with values less than the
current row's value.
Values range from 0 to 1, but the highest-ranked value is
excluded.
Source Table
CUME DIST & PERCENT RANK
PERCENTILE_CONT:
The PERCENTILE_CONT function
calculates a percentile based on a continuous distribution of
values. It is used to find the percentile value for a specified
percentage within a group of rows. It may return a value that
doesn't exist in the dataset, as it interpolates between values.
If the position is an integer, return the value at that position
and if the position is not an integer, interpolate between the two
closest values.
PERCENTILE_DISC:
The PERCENTILE_DISC function
calculates a percentile based on a discrete distribution of
values. It returns the value at the specified percentile, which
must exist in the dataset.
In this example, we will demonstrate how to semi-unpivot data.
We have a source sheet / table containg "Date" and "9_A, 9_B, 9_C, 10_A, 10_B, 10_C" and so on columns. And here 9_A means Grade 9th and A Section.
Now our target is to unpivot this type of data into multiple columns where sections are columns and date and Grade will be rows
So if you see here we are not unpivoting all data we are unpivoting and also keep unpivoted data.
I have also seen very interesting topic related with unpivot using Cross Apply. You may refer the link https://www.mssqltips.com/sqlservertip/7835/unpivot-data-sql-server-cross-apply.
Input Data
Output Data
This is an example we'll trying to unpivot data using T-SQL
SELECT TOP 5 Employee_Name, Salary FROM tbl_hrSampleData ORDER BY Salary
DESC;
2. Count of Employees by Department
SELECT Department, COUNT(*) AS EmployeeCount FROM tbl_hrSampleData GROUP
BY Department;
3. Average Salary by Department
SELECT Department, AVG(Salary) AS AverageSalary FROM tbl_hrSampleData
GROUP BY Department;
4. Marital Status Distribution
SELECT MaritalDesc, COUNT(*) AS Count FROM tbl_hrSampleData GROUP BY
MaritalDesc;
5. Gender Distribution
SELECT Sex, COUNT(*) AS Count FROM tbl_hrSampleData GROUP BY Sex;
6. Employees Hired in the Last 5 Years
SELECT Employee_Name, DateofHire FROM tbl_hrSampleData WHERE DateofHire >=
DATEADD(YEAR, -5, GETDATE());
7. List of Employees Terminated with Reason
SELECT Employee_Name, TermReason FROM tbl_hrSampleData WHERE Termd = 1;
8. Employees with Performance Score of 'Exceeds'
SELECT Employee_Name, PerformanceScore FROM tbl_hrSampleData WHERE
PerformanceScore = 'Exceeds';
9. Employees in Active Employment
SELECT Employee_Name, EmploymentStatus FROM tbl_hrSampleData WHERE
EmploymentStatus = 'Active';
10. Total Salary Expense by Department
SELECT Department, SUM(Salary) AS TotalSalary FROM tbl_hrSampleData GROUP
BY Department;
11. Top Performing Employees
SELECT Employee_Name, EngagementSurvey FROM tbl_hrSampleData ORDER BY
EngagementSurvey DESC;
12. Employees Grouped by Race Description
SELECT RaceDesc, COUNT(*) AS Count FROM tbl_hrSampleData GROUP BY
RaceDesc;
13. Count of Employees Participating in Diversity Job Fairs
SELECT COUNT(*) AS DiversityParticipants FROM tbl_hrSampleData WHERE
FromDiversityJobFairID = 1;
14. Employees in Production Department
SELECT Employee_Name, Position, Department FROM tbl_hrSampleData WHERE
Department = 'Production';
15. Employees with High Absences (>10)
SELECT Employee_Name, Absences FROM tbl_hrSampleData WHERE Absences > 10;
16. List of Managers and Their Employees
SELECT ManagerName, Employee_Name FROM tbl_hrSampleData ORDER BY
ManagerName;
17. Age of Employees (based on DOB)
SELECT Employee_Name, DATEDIFF(YEAR, DOB, GETDATE()) AS Age FROM
tbl_hrSampleData;
18. Employees Grouped by Termination Reason
SELECT TermReason, COUNT(*) AS Count FROM tbl_hrSampleData GROUP BY
TermReason;
19. Employees with Specific Positions
SELECT Employee_Name, Position FROM tbl_hrSampleData WHERE Position IN
('Software Engineer', 'Production Technician I');
20. Engagement and Satisfaction Statistics
SELECT AVG(EngagementSurvey) AS AvgEngagement, AVG(EmpSatisfaction) AS
AvgSatisfaction FROM tbl_hrSampleData;
21. Top Three Paid Employees
WITH cteSalary (Rnk, Salary) AS ( SELECT DENSE_RANK() OVER (ORDER BY
Salary DESC) AS Rnk, Salary FROM tbl_hrSampleData ) SELECT
tbl_hrSampleData.EmpID, tbl_hrSampleData.Employee_Name,
tbl_hrSampleData.Salary FROM tbl_hrSampleData INNER JOIN cteSalary ON
tbl_hrSampleData.Salary = cteSalary.Salary WHERE cteSalary.Rnk <= 3 ORDER
BY cteSalary.Rnk;
22. Top Three Paid Employees Based on Each Department
WITH cteSalary (Rnk, Salary, Department) AS ( SELECT DENSE_RANK() OVER
(PARTITION BY Department ORDER BY Salary DESC) AS Rnk, Salary, Department
FROM tbl_hrSampleData ) SELECT tbl_hrSampleData.Department,
tbl_hrSampleData.EmpID, tbl_hrSampleData.Employee_Name,
tbl_hrSampleData.Salary FROM tbl_hrSampleData INNER JOIN cteSalary ON
tbl_hrSampleData.Salary = cteSalary.Salary AND tbl_hrSampleData.Department
= cteSalary.Department WHERE cteSalary.Rnk <= 3 ORDER BY cteSalary.Rnk;
23. Count of Employees Based on Their Marital Status and Gender
Select sex Gender, Divorced, Married, Separated, Single, Widowed FROM (
Select empID, sex, MaritalDesc FROM tbl_hrSampleData ) As scrTbl PIVOT (
COUNT(empID) FOR MaritalDesc IN (Divorced, Married, Separated, Single,
Widowed) ) AS pvtTbl;
Export Data related queries
1. Export Data as XML
SELECT Employee_Name, EmpID, Department, Salary FROM tbl_hrSampleData FOR
XML AUTO, ROOT('Employees'), ELEMENTS;
Note: The above query will export the data in XML format.
FOR XML AUTO: This option generates XML output based
on the structure of the query.
ROOT('Employees'): This specifies the root element
name for the XML output.
ELEMENTS: This option generates elements for each
column in the result set.
1a. Export Hierarchical Data as XML
SELECT ManagerName AS Manager, (SELECT Employee_Name, EmpID FROM
tbl_hrSampleData AS Employees WHERE Employees.ManagerName =
tbl_hrSampleData.ManagerName FOR XML AUTO, ELEMENTS ) AS Team FROM
tbl_hrSampleData GROUP BY ManagerName FOR XML AUTO,
ROOT('Organization');
2. Export Data as JSON
SELECT Employee_Name, EmpID, Department, Salary FROM tbl_hrSampleData FOR
JSON AUTO;
Note: The above query will export the data in JSON format.
FOR JSON AUTO: Automatically formats each row as a
JSON object.
Results are returned as a JSON array containing objects for each row.
You may refer to additional queries using the embedded tool below:
Deterministic vs Nondeterministic Functions in SQL
Deterministic vs Nondeterministic Functions in SQL
In SQL, functions are categorized as deterministic or
nondeterministic based on whether they produce consistent
results under the same conditions.
Deterministic Functions
Deterministic functions always return the same result when called with the
same input values and database state.
ABS(-5): Always returns 5.
ROUND(123.456, 2): Always returns 123.46.
DATEADD(day, 5, '2025-04-15'): Always adds
5 days to the given date.
Nondeterministic Functions
Nondeterministic functions may return different results even when called
with the same input values, depending on factors like system state or
time.
GETDATE(): Returns the current date and time, which changes
with each call.
NEWID(): Generates a unique identifier (GUID) that is
different every time.
RAND(): Produces a random number, which varies with each
execution.
Key Differences
Here are the main distinctions between deterministic and nondeterministic
functions:
Deterministic Functions: Predictable and can be used in
indexed views or computed columns.
Nondeterministic Functions: Not predictable and cannot
be used in indexed views.
Determining if a function is deterministic or nondeterministic
That can be check whether a function is deterministic by querying the
is_deterministic object property for the function. The
example below determines if the function dbo.NosToWord is
deterministic.:
SELECT OBJECTPROPERTY(OBJECT_ID('dbo.NosToWord'), 'IsDeterministic') AS IsDeterministic;
The result will be 1 if the function is deterministic and
0 if it is nondeterministic.
Built-in function determinism
All of the string built-in functions are deterministic, except for FORMAT.
The following built-in functions from categories of built-in functions
other than string functions are always deterministic.
CAST
CONVERT
PARSE
TRY_PARSE
TRY_CONVERT
TRY_CAST
ABS
ACOS
ASIN
ATAN
ATN2
CEILING
COALESCE
COS
COT
DATALENGTH
DATEADD
DATEDIFF
DAY
DEGREES
EXP
FLOOR
ISNULL
ISNUMERIC
LOG
LOG10
MONTH
NULLIF
POWER
RADIANS
ROUND
SIGN
SIN
SQRT
SQUARE
TAN
The following built-in functions are nondeterministic: