How to Connect Azure Data Factory to an Azure SQL Database Using a Private Endpoint

Photo by Hafidh Satyanto on Unsplash

A step-by-step tutorial

Azure Data Factory (ADF) is great for extracting data from multiple sources, the most obvious of which may be Azure SQL. However, Azure SQL has a security option to deny public network access, which, if enabled, will prevent ADF from connecting without extra steps.

In this article, we’ll look at the steps required to set up a private endpoint and use it to connect to an Azure SQL database from Azure Data Factory.


‘Deny public network access’ setting in Azure SQL

Before we get started, let’s review which setting I’m referring to in Azure SQL. It’s a toggle named deny public network access found under Security > Firewalls and virtual networks in the Azure portal.

Source: author

When this setting is enabled, Azure Data Factory won’t connect without a private endpoint. You can see there’s even a link to create a private endpoint below the toggle control, but don’t use this now — we’ll create the request from Azure Data Factory in a minute.


ADF integration runtime

To use private endpoints in Azure Data Factory, you must use an integration runtime with virtual network configuration enabled. The setting cannot be changed, so you’ll need to create a new runtime if you don’t have one with it enabled already.

Source: author
Source: author

Now that you have an integration runtime with virtual network configuration enabled, you’re ready to create a new linked service.


ADF linked service

While still in Azure Data Factory, click to create a new linked service.

Source: author

When you select an integration runtime with virtual network configuration enabled, a managed private endpoint setting will appear in the account selection method section. The setting is read-only and will populate as you enter subscription and server details. If a managed private endpoint is already available — you’re good to go!

If a managed private endpoint isn’t available, click the create new link button to start the process.

Source: author

When you save the new managed private endpoint in Azure Data Factory, it will be provisioned in Azure but remain in a Pending status until approved.


Azure private endpoint

Now we need to hop back to Azure to approve the new private endpoint. Find your Azure SQL database in the Azure portal, and browse to Security > Private endpoint connections.

You should see the connection created by Azure Data Factory with the status Pending. Select its checkbox and click the Approve button.

Source: author

The status will change to Approved in the Azure portal. It takes a minute or two for the status to make its way to Azure Data Factory, but it will show as Approved there after a moment, too.

Source: author

Once it shows as approved, you’re ready to go. You can enter the rest of your connection info and connect!


Recap

Most of the settings I’ve shown can be accessed in several different ways and performed in different orders. For example, you could create the private endpoint from the Azure portal instead of through Azure Data Factory. You can obviously experiment and find the process that works for you.

The important pieces are the following:

  1. Azure Data Factory has an integration runtime with virtual network configuration enabled.
  2. Azure SQL has an approved private endpoint connection.
  3. Azure Data Factory has a linked service using the integration runtime and private endpoint connection.

That’s it — now go have fun with your new connection!


This article was originally published in Towards Data Science on January 20, 2021.


Interested in learning more about Azure Data Factory? Give these books a try. Note that I use affiliate links when linking to products on Amazon.

Unit Testing Stored Procedure Calls with Rhino Mocks

Database stored procedure calls are one of the trickiest things to unit test, and there are many different approaches that can be taken. My team has run the gamut: test DBs that rollback with each run, no testing for direct data access functions (!), virtual functions w/ partial mocks (see here).

The latest approach that I’ve been using is much more straightforward and feels like a more natural use of Rhino Mocks. Let’s look at some examples of how to test some common database stored procedure tasks. (Note that these examples assume use of the Microsoft Enterprise Library.)

Create a mockable Database

The primary challenge that I’ve found with testing database code is that Microsoft.Practices.EnterpriseLibrary.Data.Database isn’t mock-friendly. However, the other “pieces” such as DbCommand and DbCommandParameterCollection are very easy to work with. So, we can solve the Database problem by creating a simple wrapper (Important! Note that the methods have the virtual keyword, which will allow them to be overridden.):

public class DatabaseWrapper
{
    private Database _database;
    private Database Database
    {
        get { return _database = _database ?? DatabaseFactory.CreateDatabase(); }
        set { _database = value; }
    }

    public virtual DbCommand GetStoredProcCommand(string storedProcedureName)
    {
        return Database.GetStoredProcCommand(storedProcedureName);
    }

    public virtual void DiscoverParameters(DbCommand command)
    {
        Database.DiscoverParameters(command);
    }
}

Executing a stored procedure

Now that we are able to mock the database object, we can write some useful tests. Let’s say you want to execute a stored procedure named “MyStoredProcedure,” and you want to write a test to verify that your code handles an exception thrown when it’s executed. That’s very easy!

Here’s my class with the function I want to test:

public class MyDataAccess
{
    public DatabaseWrapper Database { get; set; }
    public Thingy GetThingy()
    {
        Thingy thingy = null;
        try
        {
            var dbCommand = Database.GetStoredProcCommand("MyStoredProcedure");
            Database.DiscoverParameters(dbCommand);
            var result = dbCommand.ExecuteNonQuery();
            // populate thingy
        }
        catch (Exception ex)
        {
            // handle exception
        }
        return thingy;
    }
}

And here’s my test that will throw an exception when the stored procedure is executed. I create my DatabaseWrapper as a PartialMock, allowing me to override its methods.

[TestMethod]
public void GetThingyHandlesException()
{
    // Arrange
    var target = new MyDataAccess();
    var mockDatabase = MockRepository.GeneratePartialMock<DatabaseWrapper>();
    target.Database = mockDatabase;

    // mock Database
    const string storedProc = "MyStoredProcedure";
    var mockDbCommand = MockRepository.GenerateMock<DbCommand>();
    mockDatabase.Expect(x => x.GetStoredProcCommand(storedProc))
        .Return(mockDbCommand);
    mockDatabase.Expect(x => x.DiscoverParameters(mockDbCommand));
    
    // mock DbCommand
    var ex = new Exception("Oh noes!");
    mockDbCommand.Expect(x => x.ExecuteNonQuery())
        .Throw(ex);

    // Act
    var actual = target.GetThingy();

    // Assert
    mockDatabase.VerifyAllExpectations();
    mockDbCommand.VerifyAllExpectations();
    Assert.IsNull(actual);
}

Setting input parameters

Need to set some input parameters? No problem!

dbCommand.Parameters["@id"].Value = id;

And, in your test, you add this:

var mockParams = MockRepository.GenerateMock<DbParameterCollection>();
var mockParam = MockRepository.GenerateMock<DbParameter>();

mockDbCommand.Expect(x => x.Parameters).Return(mockParams);

mockParams.Expect(x => x["@id"]).Return(mockParam);

const int id = 123;
mockParam.Expect(x => x.Value = id);

mockParams.VerifyAllExpectations();
mockParam.VerifyAllExpectations();

Reading output parameters

How about output parameters?

thingy.Value = dbCommand.Parameters["@Value"].Value as string;

Add the additional mocks and assertions:

var mockOutParam = MockRepository.GenerateMock<DbParameter>();

mockParams.Expect(x => x["@Value"]).Return(mockOutParam);

const string value = "MyValue";
mockOutParam.Expect(x => x.Value).Return(value);

mockParams.VerifyAllExpectations();
mockOutParam.VerifyAllExpectations();
Assert.AreEqual(value, actual.Value);

Working with sets of parameters

When you have more than a few parameters to work with, the unit test code can get quite lengthy. I like to keep it clean by extracting the duplicated logic into a separate function, like so:

var paramsToVerify = new List<DbParameter>();
mockParams.Expect(x => x["@whammyparammy"])
    .Return(MockParameter<int>(paramsToVerify));

My function allows you to specify and verify the type of each parameter, but you could easily modify it to expect a specific value.

private static DbParameter MockParameter<T>(List<DbParameter> paramsCollection)
{
    // set Expect with Arg<T>.Is.TypeOf to force the specific type
    var mockParam = MockRepository.GenerateMock<DbParameter>();
    mockParam.Expect(x => x.Value = Arg<T>.Is.TypeOf);

    if (paramsCollection != null)
        paramsCollection.Add(mockParam);

    return mockParam;
}

I keep the parameters in a list so I can verify them during my assertions.

paramsToVerify.ForEach(x => x.VerifyAllExpectations());