
Using the MS Graph API to Get Object Ids for Azure Purview

When creating entities or glossary terms in Azure Purview, you will likely want to add an expert, owner, or steward. Unfortunately, Azure Purview doesn't "speak" in terms of email addresses; it requires that you provide Azure Active Directory (AAD) Object Ids for individual users.

If you've never retrieved an AAD Object Id before, PyApacheAtlas has a simplified way of looking up a user principal name or primary email address and returning an object id.
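For context, a manual lookup against the Microsoft Graph could look roughly like the sketch below. It only builds a request URL for the Graph `/users` endpoint and parses a sample response; it makes no network call. The helper names and the sample payload are illustrative assumptions, and this is not necessarily how PyApacheAtlas performs the lookup internally.

```python
from urllib.parse import quote

GRAPH_USERS_URL = "https://graph.microsoft.com/v1.0/users"

def build_user_lookup_url(email):
    """Build a Graph URL that filters users by mail or userPrincipalName."""
    # Single quotes inside an OData string literal are escaped by doubling them.
    safe = email.replace("'", "''")
    odata_filter = f"mail eq '{safe}' or userPrincipalName eq '{safe}'"
    return f"{GRAPH_USERS_URL}?$filter={quote(odata_filter)}&$select=id"

def extract_object_id(response_json):
    """Return the first matching user's id, or None when nothing matched."""
    users = response_json.get("value", [])
    return users[0]["id"] if users else None

# Hypothetical response body, shaped like a Graph user listing.
sample = {"value": [{"id": "00000000-0000-0000-0000-000000000000"}]}
url = build_user_lookup_url("will@example.com")
object_id = extract_object_id(sample)
```

A real request would also need an access token in the Authorization header, which is the part the library handles for you.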

Using PurviewClient to get Object Ids

Assuming you've installed and authenticated PyApacheAtlas and are using the PurviewClient, you can access its msgraph property.

This references an MsGraphClient which has a handful of features:

Where can this be used?

You might manually call this function when you're programmatically creating entities:

from pyapacheatlas.core import AtlasEntity

# Assuming you've instantiated a PurviewClient as `client`
will_aad_id = client.msgraph.email_to_id('will@example.com')

ae = AtlasEntity(
    "my entity",
    "DataSet",
    "myEntityQN",
    contacts={"Expert": [{"id": will_aad_id}]},
    guid="-1",
)
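The contacts value is a dictionary keyed by contact type, with a list of id entries under each key. A small helper can build that shape from lists of email addresses. This is only a sketch: `build_contacts` and the stand-in lookup function are hypothetical names rather than part of PyApacheAtlas, and the "Owner" key is assumed to work the same way as the "Expert" key.

```python
def build_contacts(lookup, experts=(), owners=()):
    """Map emails to the {"Expert": [{"id": ...}], "Owner": [...]} shape."""
    contacts = {}
    if experts:
        contacts["Expert"] = [{"id": lookup(email)} for email in experts]
    if owners:
        contacts["Owner"] = [{"id": lookup(email)} for email in owners]
    return contacts

# Stand-in for client.msgraph.email_to_id so the sketch runs offline.
def fake_email_to_id(email):
    return "id-for-" + email

contacts = build_contacts(
    fake_email_to_id,
    experts=["will@example.com"],
    owners=["bill@example.com"],
)
```

In a real script you would pass client.msgraph.email_to_id as the lookup instead of the stand-in.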

Alternatively, the function can be provided to the parse_bulk_entities method when you are working with Excel spreadsheets. You would create an Excel spreadsheet with a tab called BulkEntities and columns like the following:

typeName | qualifiedName | name | experts
DataSet | custom://my/custom/ds | expert ds | bill@example.com;will@example.com

The Python script requires you to create an ExcelConfiguration and an ExcelReader. You then pass the path to your spreadsheet and the function you are going to use to convert the values in your experts or owners column into AAD Object Ids.

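from pyapacheatlas.readers import ExcelConfiguration, ExcelReader

# Assuming you've instantiated a PurviewClient as `client`
reader = ExcelReader(ExcelConfiguration())
entities = reader.parse_bulk_entities(
    "path/to/spreadsheet.xlsx",
    contacts_func=client.msgraph.email_to_id
)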

Note that you pass in client.msgraph.email_to_id without parentheses, not client.msgraph.email_to_id() with parentheses. With parentheses, Python calls the function immediately (and with no email address to look up). Without them, you hand over the function itself so that the reader can call it later.

The function passed into contacts_func is executed for every expert or owner provided. In this case, the PurviewClient.msgraph.email_to_id method takes an email address, queries the Microsoft Graph API for it, and returns the id. To avoid unnecessary calls to the function you provide, the parser keeps a built-in dictionary that stores the results for the duration of the parse.
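The caching behavior can be pictured with a short sketch. It is not the library's actual implementation: `resolve_contacts` and the stand-in lookup are hypothetical names, and the semicolon split simply mirrors the experts column in the spreadsheet example.

```python
def resolve_contacts(cell_values, contacts_func):
    """Resolve semicolon-delimited emails, calling contacts_func once per email."""
    cache = {}  # email -> object id, kept for this parse only
    resolved = []
    for cell in cell_values:
        ids = []
        for email in (part.strip() for part in cell.split(";")):
            if not email:
                continue
            if email not in cache:
                cache[email] = contacts_func(email)
            ids.append({"id": cache[email]})
        resolved.append(ids)
    return resolved

calls = []  # records every lookup the stand-in function receives

def fake_email_to_id(email):
    calls.append(email)
    return "id-for-" + email

rows = ["bill@example.com;will@example.com", "will@example.com"]
# Pass the function itself (no parentheses) so it is called later, per email.
resolved = resolve_contacts(rows, fake_email_to_id)
```

Here will@example.com appears in two rows but is looked up only once, which is the point of keeping the dictionary around for the whole parse.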

Using Your Personal Azure CLI Credentials

The easiest way to gain access to the Microsoft Graph is to authenticate PyApacheAtlas with your Azure CLI credentials. In most cases, users already have permission to read other users' basic info.

from azure.identity import AzureCliCredential
from pyapacheatlas.core import PurviewClient

cred = AzureCliCredential()
client = PurviewClient(
    account_name = "myPurviewAccountName",
    authentication=cred
)

aad_id = client.msgraph.email_to_id("someone@example.com")
print(aad_id)

Using Service Principal Credentials

To use the MS Graph functionality with a Service Principal, you must grant the service principal the delegated permission to query and read all users' information.

from pyapacheatlas.auth import ServicePrincipalAuthentication
from pyapacheatlas.core import PurviewClient

oauth = ServicePrincipalAuthentication(
    tenant_id="TENANT_ID",
    client_id="CLIENT_ID",
    client_secret="CLIENT_SECRET"
)
client = PurviewClient(
    account_name = "myPurviewAccountName",
    authentication=oauth
)

aad_id = client.msgraph.email_to_id("someone@example.com")
print(aad_id)

Before you can execute the script above, make sure your service principal has permission to access the MS Graph and read user information. An AAD admin must grant this permission.