T-SQL Tuesday #016 Aggregate Functions #tsql2sday

T-SQL Tuesday
@grrl_geek is hosting a blog discussion on *Aggregate Functions*.  You can head over there to see what all the wonderful bloggers posted on this topic.  If you want to learn more about #tsql2sday then see A Year of Tuesdays: T-SQL Tuesday Meta-Roundup.  Jes has passions for SQL Server, running, cooking, coffee and the Jaycees and to learn more see About Jes.

Aggregate Functions
To be honest this post is not going to have any practical application for aggregate functions.  I considered passing on this blog party but I just can’t stay away from a party.  Instead I decided to use the opportunity as motivation to learn something new.  This post will be an academic exercise with CLR Aggregates.  [Code at end of post]

What to Aggregate?
First I turned to the search engine named after the 1 with one hundred zeros.  Most examples on CLR Aggregates were about concatenation but there was one by @AndrewNovick on Product and one by @DavidJPoole on Linear Regression.  David stated, quite correctly, that most CLR examples were academic in nature (hence the above warning) but he felt he found a practical one.  Wait posts with Math!?  Yes, that post (and this) has some math but the math is not relevant to the learning.  He put together a wonderful post and hopefully you are convinced to perform a left mouse click 😉

Algorithm
What if rather than trying to find the slope of a line for several points as in linear regression we tried to find the algorithm to the rate of growth for several cummulative values?  The general idea is to reverse generate a function from the values.  There is no practical use for this very, very, very generic approach in this post in attempting this.  Again the purpose of the post to learn something new.

Something New
The first new thing created was a Visual C#, Database, SQL Server Project in Visual Studio 2008.  Next, Add Database Connection.  Next, Add Aggregate.  Code the required methods Init(), Accumulate(), Merge() and Terminate().  I used three values n (short) to represent the number of values, x1 (double) to represent the first value and xn (double) to represent the last value and AlgorithmType (enum) to represent the result.  Those three values will of course be inadequate to truly determine the algorithm.  This is just how I decided to make a *guess*.

Deployment
With the first attempt to deploy from Visual Studio there was an error occurred stating *marked for user-defined serialization, but does not implement the “System.Data.Microsoft.SqlServer.Server.IBinarySerialize” interface*.  Trying to create the assembly in Management Studio from the built dll returned the same error.   Having no research luck, I defined the struct as implementing IBinarySerialize and added methods Read() and Write().  The CLR aggregate deployed in spite of the fact CLR was not even enabled!

Usage
With the aggregate in place, it was time to move onto testing.  I created a table with fields: n (increment), xn (value), fx (algorithm type).  Table population led to a couple issues.  First was the arithmetic overflow for the POWER(3, 20).  Setting the range as 1 to 19 per algorithm fixed the first issue but then during testing I found a second issue with the table data involving the math.  (Don’t leave yet)  The logarithmic value for 1 is zero which is not an allowed initial value for my guessing aggregate.  Setting the range as 2 to 19 for the input values but subtracting one for the sequence identifier (n), fixed the second issue. [commented in code below]

Summary
1. CLR Aggregates are rather easy to implement (at least for a non-production quality).
2. Further research on the serialization requirement needs performed.
3. CLR code can live on your server even if CLR is not enabled.
4. There does not appear to be much use or need for CLR aggregates according to my search engine results.

Credits
Jes Borland
Adam Machanic
Andrew Novick
David Poole

See Also
How to: Create and Run a SQL Server Aggregate by using Common Language Run-time Integration
Requirements for CLR User-Defined Aggregates
CREATE AGGREGATE (Transact-SQL)
Invoking CLR User-Defined Aggregate Functions
Time complexity

C# Code
Please do not use for anything but training purposes.

using System;

using System.Data;

using System.Data.SqlClient;

using System.Data.SqlTypes;

using Microsoft.SqlServer.Server;

 

public enum AlgorithmType : byte

{

    Unknown = 0,

    Constant = 1,

    NaturalLogarithmic = 2,

    Linear = 3,

    NaturalLinearithmic = 4,

    Quadratic = 5,

    Cubic = 6,

    ThreeExponential = 7,

    OutOfBounds = 255

}

 

[Serializable]

[Microsoft.SqlServer.Server.SqlUserDefinedAggregate

(

    Format.UserDefined, //use clr serialization to serialize the intermediate result

    IsInvariantToNulls = false, //optimizer property

    IsInvariantToDuplicates = true, //optimizer property

    IsInvariantToOrder = false, //optimizer property

    MaxByteSize = 20) //maximum size in bytes of persisted value

]

public struct GuessAlgorithm : IBinarySerialize

{

    public void Init()

    {

        // Put your code here

        this.n = 0;

        this.x1 = 0;

 

    }

 

    public void Accumulate(double Value)

    {

        // Put your code here

        this.n++;

        if (this.n == 1)

        {

            this.x1 = Value;

        }

        this.xn = Value;

    }

 

    public void Merge(GuessAlgorithm Group)

    {

        // Put your code here

        this.n += Group.n;

        if (Group.x1 < this.x1)

        {

            this.x1 = Group.x1;

        }

        else if (Group.xn < this.xn)

        {

            this.xn = Group.xn;

        }

    }

 

    public SqlString Terminate()

    {

        // Put your code here

        AlgorithmType algorithmTypeGuess = this.guess();

        return new SqlString(algorithmTypeGuess.ToString());

    }

 

    // This is a place-holder member field

    private short n;

    private double x1;

    private double xn;

 

    private AlgorithmType guess()

    {

        double scale;

 

        if ((this.n < 3) || (!(this.x1 > 0)))

        {

            return AlgorithmType.Unknown;

        }

        else

        {

            scale = this.xn / this.x1;

        }

 

        if (scale <= 1)

        {

            return AlgorithmType.Constant;

        }

        else if (scale <= (Math.Log(this.n, 2.71828182845905)))

        {

            return AlgorithmType.NaturalLogarithmic;

        }

        else if (scale <= this.n)

        {

            return AlgorithmType.Linear;

        }

        else if (scale <= (this.n * (Math.Log(this.n, 2.71828182845905))))

        {

            return AlgorithmType.NaturalLinearithmic;

        }

        else if (scale <= Math.Pow(this.n, 2))

        {

            return AlgorithmType.Quadratic;

        }

        else if (scale <= Math.Pow(this.n, 3))

        {

            return AlgorithmType.Cubic;

        }

        else if (scale <= Math.Pow(3, this.n))

        {

            return AlgorithmType.ThreeExponential;

        }

        else

        {

            return AlgorithmType.OutOfBounds;

        }

    }

 

    public void Read(System.IO.BinaryReader SerializationReader)

    {

        this.n = SerializationReader.ReadInt16();

        this.x1 = SerializationReader.ReadDouble();

        this.xn = SerializationReader.ReadDouble();

    }

 

    public void Write(System.IO.BinaryWriter SerializationWriter)

    {

        SerializationWriter.Write(this.n);

        SerializationWriter.Write(this.x1);

        SerializationWriter.Write(this.xn);

    }

}

T-SQL Code
Please do not use for anything but training purposes.

/* cleanup */

IF  EXISTS (SELECT 0 FROM sys.objects WHERE [object_id] = OBJECT_ID(N'[dbo].[graph]’) AND type = ‘U’)

DROP TABLE [dbo].[graph];

GO

/* build */

CREATE TABLE dbo.graph (n smallint NOT NULL, xn decimal(38,19) NOT NULL, fx varchar(20) NOT NULL, PRIMARY KEY CLUSTERED (fx, n));

go

/* populate */

INSERT INTO dbo.graph

(

       n

      ,xn

      ,fx

)

SELECT

       n 1 /* n runs from 2 to 19 to prevent 0 return values in the log functions */

      ,

            CASE fx

                  WHEN ‘CONSTANT’ THEN 1

                  WHEN ‘NATURAL_LOGARITHM’ THEN LOG(n)

                  WHEN ‘LINEAR’ THEN n

                  WHEN ‘NATURAL_LINEARITHMIC’ THEN (n * LOG(n))

                  WHEN ‘QUADRATIC’ THEN POWER(n, 2)

                  WHEN ‘CUBIC’ THEN POWER(n,3)

                  WHEN ‘THREE_EXPONENTIAL’ THEN POWER(3, n)

            END

      AS xn

      ,fx

FROM

(

 SELECT

  (one._bit * POWER(2, 0)) + (two._bit * POWER(2, 1)) + (three._bit * POWER(2, 2)) + (four._bit * POWER(2, 3)) + (five._bit * POWER(2, 4)) AS n

 FROM

  (SELECT 0 AS _bit UNION ALL SELECT 1) AS one

 CROSS JOIN

  (SELECT 0 AS _bit UNION ALL SELECT 1) AS two

 CROSS JOIN

  (SELECT 0 AS _bit UNION ALL SELECT 1) AS three

 CROSS JOIN

  (SELECT 0 AS _bit UNION ALL SELECT 1) AS four

 CROSS JOIN

  (SELECT 0 AS _bit UNION ALL SELECT 1) AS five

 WHERE

  (one._bit * POWER(2, 0)) + (two._bit * POWER(2, 1)) + (three._bit * POWER(2, 2)) + (four._bit * POWER(2, 3)) + (five._bit * POWER(2, 4)) BETWEEN 2 AND 19

) AS eighteen

CROSS JOIN

(

 SELECT ‘CONSTANT’ AS fx UNION ALL SELECT ‘NATURAL_LOGARITHM’ UNION ALL SELECT ‘LINEAR’ UNION ALL SELECT ‘NATURAL_LINEARITHMIC’ UNION ALL SELECT ‘QUADRATIC’ UNION ALL SELECT ‘CUBIC’ UNION ALL SELECT ‘THREE_EXPONENTIAL’

) AS graph

ORDER BY

       fx

      ,n;

go

/* query */

SELECT

       fx

      ,dbo.GuessAlgorithm(xn)

FROM

      dbo.graph

GROUP BY

      fx;

go

Advertisements

About Robert Matthew Cook

Hello, my name is Robert Matthew Cook. This blog is autoposted to from my main blog at www.sqlmashup.com. For more profile information or to leave a comment, please visit me there.
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s