Python developers, before you start rolling your eyes, no, this is not a post about pyodbc.
What this post is about is flipping the script – how do we run Python within SQL Server ? And you may wonder why you might wish to do so…?
As you are all probably aware, digital transformation has been the corporate desire for a while now, together with a move to cloud computing.
And as organisations strive to find improved and varied ways to make their data pipelines more efficient and using a variety of tools including Python, the need for integrating these different approaches in a coherent, manageable and resource-friendly way is essential.
SQL Server Machine Learning Services
Since SQL Server 2016, it was possible to run non-SQL scripting languages within SQL Server with the introduction of r services.
With SQL Server 2017 onwards, Microsoft extended the capability by enabling the execution of Python scripts as well, with SQL Machine Learning Services and SQL Machine Learning Server (SQL Enterprise only).
Microsoft’s reference site docs.microsoft.com has an excellent primer, with many examples and how-to links, describing What is SQL Server Machine Learning Services with Python and R?
The key stored procedure that developers need to understand is sp execute external script which allows Python code to be embedded within SQL Server.
I’m not going into any detail about this stored procedure, but there are some things you need to bear in mind when working with this Machine Learning Services process.
It’s not quite lift-and-shift!
- Test your Python code in Python before embedding in SQL Server – obvious yes, but worth stating explicitly.
- Python code indentation requirements are the same when embedded in SQL Machine Learning Services.
- Quote marks – if like me you prefer to use single quotes in your python code for literal or parameter values, you’ll need to double these up since the python script is provided to sp execute external script as unicode text. Either that or just use double quotes.
Depending on which edition of SQL Server you are using, your perfectly succinct Python script, which runs like a dream in native
Python, may not be supported in SQL Machine Learning Services.
For example, if you are using SQL Server 2017 Standard Edition, then the base build of Python in Machine Learning Services is
3.5.2 (with pandas only 0.19)!
This would be pretty frustrating and off-putting bearing in mind Python is now currently up to version 3.9.1.
It is possible to sync SQL Machine Learning Services to the latest builds of Python, pandas etc., as well as third-party Python
packages, but this requires SQL Machine Learning Server.
SQL Server Enterprise Edition is required to install SQL Server Machine Learning Server, which is necessary to bind it to Machine Learning Services.
More recently, thankfully, there have been updates which help simplify the process:
- If you have installed CU22 or later for SQL Server 2017 then follow this process.
- For installing packages using standard Python Tools, for SQL Server 2017 only, follow this process.
As organisations adopt multiple approaches to developing data pipelines, using Python and other languages amongst others, SQL Machine Learning Services in an on-premises Data Platform can be a useful stepping stone to your digital transformation, and building data pipelines in the cloud.
Do you need help with your digital transformation?
Would you like to know more about how cloud services can help improve your organisational data strategy?
Then get in touch with risual at https://www.risual.com/contact/.