Read Data from Microsoft SQL Server / Azure SQL Server in Databricks

Databricks provides a long list of libraries to read different types of data files and data sources. In a similar way, it provides a JDBC driver to read data from Microsoft SQL Server and Azure SQL Server.

Follow the steps below to connect to SQL Server and read data from it.

Step 1: Load User Name and Password from Azure Key Vault

Use the commands below to load the user name and password from Azure Key Vault.

jdbcUsername = dbutils.secrets.get(scope = "jdbc", key = "username")
jdbcPassword = dbutils.secrets.get(scope = "jdbc", key = "password")
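
If you are not sure which secret scopes and keys exist in your workspace, you can list them first. The scope name "jdbc" and the key names "username" and "password" are only examples here; use the names defined in your own Key Vault-backed scope.

# List the secret scopes available in the workspace
dbutils.secrets.listScopes()

# List the keys stored in the "jdbc" scope (the secret values themselves stay hidden)
dbutils.secrets.list("jdbc")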

Step 2: Build the JDBC URL and Connection Properties

Next, we need to build the JDBC URL and the connection properties that we will use to connect to SQL Server.

HostName = "your sql server host name"
Port = "1433"  # default SQL Server port
DatabaseName = "your database name"

# Now create the JDBC URL for the server and database
jdbc_url = "jdbc:sqlserver://{0}:{1};database={2}".format(HostName, Port, DatabaseName)

ConnectionProperties = {
    "user": jdbcUsername,
    "password": jdbcPassword,
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

Step 3: Read Data from SQL Server

Now that the JDBC URL and connection properties are ready, we can use them to read data from SQL Server.

df = spark.read.jdbc(url=jdbc_url, table="employee", properties=ConnectionProperties)

df.show()
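
As an alternative to spark.read.jdbc, you can use the generic DataFrame reader with the jdbc format and pass the same connection details as options. The sketch below is equivalent to the read above; the table name "employee" is just an example.

# Equivalent read using the generic JDBC data source options
df = (spark.read
      .format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "employee")
      .option("user", jdbcUsername)
      .option("password", jdbcPassword)
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .load())

df.show()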

Read Data Using a Query

In the previous example we learned how to read data from the employee table; however, in some cases you may wish to read data using a SQL query. For example, you may wish to filter the data for the “HR” department only.

To read data using a query, wrap the query in a subquery alias and pass it as the table, as shown below:

sql = "(select * from employee where department='HR') as hr_employee"

df = spark.read.jdbc(url=jdbc_url, table=sql, properties=ConnectionProperties)

df.show()

This method loads only the HR department employees from the employee table.
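
If you prefer not to wrap the statement in a subquery alias, recent Spark versions also support passing it directly through the query option of the JDBC reader. A minimal sketch, assuming the same connection details as above:

# Push the query down to SQL Server using the "query" option
df = (spark.read
      .format("jdbc")
      .option("url", jdbc_url)
      .option("query", "select * from employee where department = 'HR'")
      .option("user", jdbcUsername)
      .option("password", jdbcPassword)
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .load())

df.show()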

You have now learned how to write code in Databricks to read data from SQL Server and store it in a DataFrame. You can modify the code as per your needs, for example to read a large table in parallel, as sketched below.
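
If the table is large, spark.read.jdbc can also split the read across several parallel connections by partitioning on a numeric column. The column name "employee_id" and the bounds below are only assumptions for illustration; choose a column and range that fit your own table.

# Read the employee table in 4 parallel partitions, split on a numeric column
df = spark.read.jdbc(
    url=jdbc_url,
    table="employee",
    column="employee_id",   # hypothetical numeric column used to split the read
    lowerBound=1,
    upperBound=100000,
    numPartitions=4,
    properties=ConnectionProperties
)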

Please feel free to send a message if you have any questions. Also, please leave a message if you would like me to write on any other topic.

Read other topics

How to connect to Azure SQL Server from Databricks using JDBC

Read complex/nested XML files in Databricks

Read Excel files in Databricks Spark SQL

How to create temporary table in Azure Databricks SQL