Long read: Modelling Identity in Enterprise Architecture / ArchiMate

While I regularly inspect ICT architecture models (both as a teacher and as a consultant), I don’t often see concepts for “identities” and “accounts” present in those models. Authentication and authorization (A&A) may be implied in those models, usually by connecting the system or application with an identity provider such as a directory service, but invariably details are lacking. There’s the application, and there are the consumers of the application (like users). The consumers just consume; A&A is simply assumed to be present and working properly. If any attention is given at all to identity or the bearer thereof, then more often than not the descriptions are incomplete or downright sloppy.

However, when you’re working on an identity & access management (IAM) project, there’s no way around the topics of identity, authentication and authorisation. That’s because the whole issue of IAM solutions is to manipulate and manage identities and accounts. Also, when dealing with security, authentication is almost certain to come up – and that’s of course directly connected to identity. So what is an identity, and how would we model it in ArchiMate?

I wasn’t able to find a good example (or any example at all) of an ArchiMate model that contains both accounts and identities. That doesn’t mean there aren’t any, just that I couldn’t find them. I couldn’t even find a suitable model for the concept of identities. So I decided to go to the drawing board myself, trying to consolidate what’s available in literature and standards. And then I discovered that even the concepts of accounts and identities aren’t that well defined. Going further and further down the rabbit hole, I attempted to come to a workable set of definitions and modelling conventions for identity, authentication, and authorisation. In this document I’ll cover the first, arguably hardest part: identity. In the future I’ll have to tackle credentials and access rights.

Common conception of identities and accounts

Let’s first start with a simple model, that captures the essence of what is a quite common way of thinking about accounts and identities:

Figure 1: Oversimplified model for identity

This model contains the actual person, represented by ArchiMate concept Actor, and labelled “Person”. An IT account for this person could be modelled using an ArchiMate Data object, which –unsurprisingly– has been labelled Account. For good measure, we throw in the person’s Job title: modelled as an ArchiMate business role, as one would expect. Note how in this model, the Account itself is distinct from the authorizations that would be granted to a person based on their role in the organization. In other words: the Account solely represents the Person’s identity as it is contained in some IT system, like Novell eDirectory or Microsoft Active Directory, or like the user database of an online forum.

Alternatively, we could associate the Account with the Job title, but people mostly feel their account goes with themselves as persons, not with their actual jobs in the organization. Therefore, they log into their account using their name, not their job title. “John, the server admin” is “John”, not so much “the server admin”. This seems apt, since getting a promotion doesn’t usually mean receiving a new computer account.

There are issues with the model presented above: the account just sits there, directly connected with the Person, without any indication where it is and isn’t valid. Model-wise the account for John Doe could be just as valid at Company X as in the local library. The scope within which the Account can be used, can be recognized, is not part of this simple model. It’s true that in this way of modelling, one could add multiple accounts to the Person, but all reasoning about context and identity that leads to such a decision would be implicit. And the presence of Job title actually muddies the account-waters: there can be interactions with the IT systems that are not instigated by employees or members of the organization, but by end users, external users, IT systems, and the like. Finally, note the weak relation between Person and Account: association. That’s because there isn’t a direct relation in the ArchiMate metamodel between business layer active structure component “actor” and application layer passive structure component “data object”. An indirect relation “access” could be modelled, but that would suggest that the actor was supporting an application (left out of the view) that in turn accesses the account – not the kind of “access” we’d mean.

All in all, the model in Figure 1 is clearly oversimplified. What’s needed is definition, detail and intention, and the next sections will attempt to add these.

Identity is associated with a subject

We’ll start with a closer look at the entity that represents an identity: the Person in Figure 1. Immediately we find a shortcoming: an important aspect of identity is that it isn’t limited to “users”, i.e. persons; a subject could also be some other entity that needs to be identified, for instance an authorized system like a web server. An ICT system can interact with digital services, in the form of machine-to-machine communications, which would require that system to be properly identified. Clearly, an IT system such as a web server is not a person.

In the literature concerned with Identity & Access Management, this is readily recognized. Here, usually the term “Subject” is being used. A definition for Subject given in NIST special publication 800-63-3 (Grassi, Garcia, & Fenton, 2017) is: “A person, organization, device, hardware, network, software, or service.” I would like to note that in this definition, device and hardware appear to be synonyms, and a network could either be seen as a service, or as a type of device. Also, I believe a group could also be something that has an identity, and “organization” doesn’t fully cover that possibility. So developing from the NIST definition, I posit the following definition for “Subject” in the context of Identity & Access Management:

Subject:
A person, device, system (in its broadest sense) or an entity (a legal entity, a group, or any other collective that may act as a unit).

Let’s express this in ArchiMate terms:

Figure 2: The concept of a “Subject”

The Subject is a concept of type Actor, but it can be specialized into Person, Entity, System or Device. All are Business actors, as all are entities capable of performing behaviour.
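As a rough illustration of this specialization outside ArchiMate, the hierarchy can be sketched as a small class tree. This is only a sketch: the Python class names mirror the labels in Figure 2, and the `label` field is an assumption added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    """Anything that may need to be identified (a Business actor in the model)."""
    label: str  # hypothetical display name, not part of the ArchiMate model


class Person(Subject):
    """A natural person."""


class Entity(Subject):
    """A legal entity, a group, or any other collective acting as a unit."""


class System(Subject):
    """A system in its broadest sense, for instance a web server."""


class Device(Subject):
    """A physical device."""


# Every specialization can be handled wherever a Subject is expected.
subjects = [Person("John"), System("Web server for organization X")]
print([type(s).__name__ for s in subjects])
```

The point of the sketch is only that anything written against `Subject` works unchanged for persons, systems, devices and entities, which is what the specialization relation in Figure 2 expresses.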

A definition of “Identity”?

Now let’s move to the topic at hand: identity. When talking about it, at first it might seem everybody understands exactly what we mean by the term. However, in practice people seem to have quite divergent understandings of what an identity actually is. Terms like “identity”, “account”, “login name”, “identifier” and “subject” (or “person”) may be used in varying combinations and with different interrelations, or even as synonyms. This shows the need for good definitions.

At this point, one might then expect I would make an attempt to define the concept of a real-world “Identity”, but I can’t. This is because the very idea of “identity” presents a really big problem, one that philosophers have been mulling on for centuries. An article like Noonan & Curtis (2018) can show you just how complex the topic can be. As far as I can understand these things, identity is something that describes how a symbol or descriptor (like a “name”) relates to its referent (like a “person”).

Fortunately, for modelling identity within enterprise architecture we can choose to forego the step of defining what a real-world identity precisely is, because for the issues that we tackle in enterprise architecture, we’re dealing with entities that mostly manifest themselves in a digital sense: “digital identities”. It’s been argued that the world is experiencing a shift from physical identities to digital identities (e.g. McWaters, 2016). Ben Ayed (2014) states “Digital identity is considered as an intersection of identity and technology in the digital age.” And even the Dutch National Office for Identity Data consider the identities that they themselves manage to be digital ones (see NORA, n.d., in Dutch). So let’s focus on the identities that people have in the digital world.

A definition of “Digital identity” then…

First off, let’s find some examples, as we might be able to derive some relevant aspects from those:

Person John has two different digital identities:
- Employee at company X. The way John is identified in the company’s information systems is by means of John’s login name “John.Doe@companyX.com”;
- Book lender at the local library. The library records John’s activities under the library member number “A123456”, and John uses this number as his login name in the library’s online book reservation system.
Person Jane has two identities at the same educational organization, university U:
- Regular employee. The university’s systems know her by her login name “Jane Doe”;
- Teacher, with access to sensitive systems like the grades database. This requires a different information systems account because of the specific, high-privilege access rights. This second account has login name “T-Jane Doe”;
Physical server brand A, model type B, serial number <serial_number_C> has many identities, among which:
- Web server for organization X. Its DNS name is “www.organizationX.org”;
- Web server for company Y. Its DNS name is “www.companyY.com”;
- Mail server for company Y. Its DNS name is “mail.companyY.com”.

The examples already provide us with some important clues. John’s example shows that one person will usually have multiple digital identities, each of them valid in a different context. And Jane’s example shows that this context doesn’t just cover an organization’s scope (or she would only have one identity). Furthermore, for each digital identity there’s some sort of single symbol, labelled in the examples as “name”, which can serve to identify the person or entity. And finally, the web server example confirms that identities are not just for persons, but potentially also for other entities. If we think about that for a little, we’ll recognize that this “name” needs to be unique, at least within the context within which the identity is being used.

Now let’s see if we can find some guidance from reputable sources:

NIST special publication 800-63-3 (Grassi, Garcia, & Fenton, 2017) provides this definition: “Identity: An attribute or set of attributes that uniquely describe a subject within a given context”. Elsewhere, it says “[a] digital identity is the unique representation of a subject engaged in an online transaction. A digital identity is always unique in the context of a digital service, but does not necessarily need to uniquely identify the subject in all contexts.”
Windley (2008) has something similar to say: “A digital identity contains data that uniquely describes a person or thing (called the subject or entity in the language of digital identity) but also contains information about the subject’s relationships to other entities.”
Ben Ayed (2014) shows that “Thus, identity is defined as a collection of data about subject that represent attributes, preferences, and traits, so in parallel, in the digital world a person’s identity is typically referred to as their digital identity.”

We can combine the definitions and observations from these sources to obtain a workable definition:

Digital identity:
A unique representation of a subject interacting with digital services. It entails data (an identity attribute, or set thereof) which uniquely describes that subject within a given context (although it needn’t uniquely describe that subject in all contexts).
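To make the “unique within a given context” part of this definition concrete, here is a minimal sketch that keys identifiers by their context. The `registry` dictionary and the two helper functions are assumptions made for illustration; the contexts and login names are taken from the John and Jane examples above.

```python
# (identity context, identifier) -> subject
registry = {
    ("Company X", "John.Doe@companyX.com"): "John",
    ("Local library", "A123456"): "John",
    ("University U", "Jane Doe"): "Jane",
    ("University U", "T-Jane Doe"): "Jane",
}


def resolve(context, identifier):
    """Return the subject an identifier refers to, but only within its own context."""
    return registry.get((context, identifier))


def identities_of(subject):
    """List every (context, identifier) pair that represents the given subject."""
    return [key for key, value in registry.items() if value == subject]


print(resolve("Local library", "A123456"))  # John
print(resolve("Company X", "A123456"))      # None: the library number means nothing here
print(identities_of("Jane"))
```

The same subject appears under several keys, and an identifier only resolves inside the context in which it was issued.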

Identity attributes make up a digital identity

The definitions for Subject and for Digital identity, when taken together, support the notion that any subject can be uniquely identified, using an identity attribute, or set thereof. Note, however, that this requires another definition: that of the identity attribute. NIST special publication 800-63-3 (Grassi, Garcia, & Fenton, 2017) has this to say: “Attribute: a quality or characteristic ascribed to someone or something”. That’s not specific enough, but we can augment this by inserting “subject” instead of “someone or something”, and add the idea of identification. The result is the following:

Identity attribute:
A quality or characteristic ascribed to a subject, and suitable to distinguish an individual subject from other subjects within a given context.

Note that an identity attribute needn’t uniquely identify a subject on its own, like maybe a passport number or DNA profile does. My surname can by itself serve as an identity attribute, as well as my given name, birth date, eye colour, home address and cell phone number. Any of these attributes could be used to uniquely identify “me” in a particular context, either by themselves (my passport number while in the airport, my given name while sitting in a meeting) or in a small set (the combination of my last name and birth date when phoning with my doctor’s office; the combination of my last name and home address when contacting my plumber).
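The idea that an attribute may identify a subject only in combination with others can be checked mechanically. The sketch below assumes a hypothetical patient register for a doctor’s office; the records and the function name are invented for illustration.

```python
def identifies_uniquely(records, attribute_names):
    """Return True if the chosen attributes distinguish every record in the context."""
    seen = set()
    for record in records:
        key = tuple(record[name] for name in attribute_names)
        if key in seen:
            return False  # two subjects share these attribute values
        seen.add(key)
    return True


# Hypothetical identity context: the register of a doctor's office.
doctors_office = [
    {"surname": "Doe", "birth_date": "1970-01-01", "given_name": "John"},
    {"surname": "Doe", "birth_date": "1972-05-17", "given_name": "Jane"},
    {"surname": "Smith", "birth_date": "1970-01-01", "given_name": "Ann"},
]

print(identifies_uniquely(doctors_office, ["surname"]))                # False
print(identifies_uniquely(doctors_office, ["surname", "birth_date"]))  # True
```

Neither surname nor birth date is unique on its own in this register, but the pair is, which matches the doctor’s office example in the text.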

But wait a minute! The definition for Identity attribute says it’s a quality ascribed to a subject. So it needn’t be the case that the attribute is inherited from the subject. And indeed, as the examples show, while some identity attributes are inherent to the subject (eye colour, other biometrics) many are not: a passport number is ascribed by a nation’s government; a login is assigned by some sort of account registration process. So if we’re going to model identity attributes, we probably also need some sort of source for those attributes that are not inherent to the subject. But let’s just park this observation until after we’ve covered another aspect of identities and identity attributes: context.

A digital identity is valid in a limited scope

Both in the definition of Digital Identity and of Identity attribute given above, there’s one term we haven’t explored yet, and that’s the term “context”. It seems we must be able to recognize and model the scope for sets of identities and for identity attributes, as indicated by the term “context”. So how would we go about defining scope/context for identity?

First let’s note that such a context can be modelled using the group concept; modelled in this fashion, it acts as a “logical place”, which seems fitting since collections of ICT facilities are not uncommonly referred to as “landscapes”. Also, ArchiMate 3 allows for any and all concepts to be grouped under a group concept using the aggregation relation, which also fits nicely with the idea of an identity context. Services, components, data objects: each of these could be part of one single identity context, within which John Doe’s account serves to uniquely refer to John. Thus, John could use his account to interact with a service, a component, a data object in a specific identity context using a single digital identity, and the results of his interaction could be linked to John’s digital identity.

One could propose that the Location concept might better serve the purpose of modelling a “logical place”; the specification (Open Group, 2017) relationship table in appendix B certainly shows composition and aggregation are allowed from Location to everything else. However, remember that its ArchiMate definition is “A location is a place or position where structure elements can be located or behaviour can be performed”. This definition doesn’t line up fully with the idea that the identity context is based on meaning and information. So “group” is a better concept, in ArchiMate anyway.

Now we’re ready for the definition. Let me put forward this:

Identity context:
A logical grouping of information system concepts, among which a particular set of digital identities is generally recognized and their contained identity attributes are meaningful.

And this is where we have to return to the observation parked in the previous section: identity attributes may be ascribed, instead of inherited from the subject. But where are these attributes then coming from?

To this end I would like to put forward this: a subject’s identity within an identity context is closely related to that subject’s interaction with the information systems within that context. And therefore that subject’s identity attributes are as well. Now if we investigate the interaction between subject and information systems, then we recognize that this interaction is usually explicitly designed: the information systems serve the subject, in a way that supports that subject within the identity context. And how this serving should take form can be captured by the role that the subject is playing in their interaction with the information systems. Thus, we need a concept that can represent that role; the label could be something like “Interaction role”. A definition for this concept:

Interaction role:
The responsibility that a subject has, or role that a subject plays, in the interaction with the information systems in a single identity context.

A single subject could be assigned to more than one of these, as the subject may have more than one interaction, and these may be in different roles and/or in different sets of information systems. For example, a subject may have the Interaction role of “employee” within the information systems context of their company. But a subject may also have the Interaction role of “customer”, or even “prospect”, in which case no formal relation exists (yet) between subject and the owners of the information systems within the Identity context.

In ArchiMate it seems reasonable to model Interaction role using the business role concept, as that ArchiMate concept is defined as “… the responsibility for performing specific behaviour, to which an actor can be assigned, or the part an actor plays in a particular action or event”. If we were modelling employees and other members of an organization, then these interaction roles could be closely related to the Job title that we saw in the simplified model.

Putting the concepts in an ArchiMate model

With the different concepts of digital identity clearly defined, we can add them to our prior ArchiMate model, as presented in the figure below:

Figure 3: Digital identity, modelled in ArchiMate

We’re starting from the Subject from Figure 2, and assign it to an Interaction role within an Identity context. This Interaction role accesses Digital identity, a business object composed of one or more Identity attributes. The access relation is used (without an arrowhead to indicate either read or write) since Identity attributes in the Digital identity may obtain their values from the Interaction role. Furthermore, the Digital identity is also directly accessed by the Subject, since Identity attributes in the Digital identity may obtain their values directly from the subject. The Digital identity and Interaction role are composed by the Identity context; the Digital identity is recognized within this context as uniquely identifying the Subject.
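For readers who think more easily in data structures than in diagrams, the relations in Figure 3 can be approximated as follows. This is a sketch under assumptions: the class and method names are invented, the uniqueness rule is simplified to “no two identities with identical attributes”, and the values are borrowed from the library example later in this post.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalIdentity:
    """Business object composed of identity attributes."""
    subject: str
    interaction_role: str
    attributes: dict  # identity attribute name -> value


@dataclass
class IdentityContext:
    """Group that composes interaction roles and digital identities."""
    name: str
    interaction_roles: set = field(default_factory=set)
    identities: list = field(default_factory=list)

    def register(self, identity):
        """Add a digital identity, enforcing the rules of this context."""
        if identity.interaction_role not in self.interaction_roles:
            raise ValueError("interaction role is not part of this context")
        # Simplified rule: a digital identity must be unique within its context.
        if any(known.attributes == identity.attributes for known in self.identities):
            raise ValueError("identity attributes already in use in this context")
        self.identities.append(identity)


library = IdentityContext("Local library", interaction_roles={"Lender"})
library.register(
    DigitalIdentity("John", "Lender", {"Name": "J. Doe", "Member": "A123456"})
)
print(len(library.identities))
```

Placing the uniqueness check on the context rather than on the identity mirrors the model: it is the Identity context that composes the identities and within which they are recognized.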

Using digital identities in ICT landscapes

Looking back at the example identities listed before, it’s easy to recognize that the digital identities that John and Jane have at the mentioned organizations are recorded in the ICT systems in the form of “accounts”, and given what we’ve defined for digital identities as abstract concepts, the definition of the more concrete concept “account” is relatively easy:

Account:
The unique technical representation of a digital identity within an information system context, which contains at a minimum those identity attributes that are both relevant to all information systems within that context, and correspond to the represented digital identity.

When creating a model for an Account, this definition can guide us in several ways. Since the Account is a technical representation of something that we’ve modelled as a Business object, it stands to reason that Account is a Data object, and it’s related to Digital identity by means of the realization relation. And since the account must “contain” the values of the identity attributes of the Digital identity, it must be composed of a number of “Account attributes”, some of which must realize all the different (relevant) identity attributes that define the Digital identity.

And these observations help us to augment the model in figure 3, so that we arrive at the following model:

Figure 4: Relating Digital identity to Account

Two examples

So let’s see if the model we’ve arrived at can actually serve the examples we started out with. To this end, we look at John Doe’s Windows account with his organization X. The idea of a Windows account can be modelled as follows:

Figure 5: A Windows account for an employee

Person and Digital identity have been taken directly from Figure 4. The business role object “Employee” captures the Employee information that is contained in the Digital identity; it’s the Interaction role for this identity context, and it does not stem directly from the subject. There’s the Identity attribute “Employee name” within the Digital identity that inherits its value directly from the Person, and “Employee number”, which is attributed a value based on the Person’s interaction role as an Employee. The Account object “Windows account” contains two different account attributes, one of which is unique and derived from the Digital identity (and the other one, objectGUID, entirely technical). Also, I’ve added an application service named Windows IdP (most likely realized by an application component like Microsoft Active Directory) since that’s the application behaviour that actually manages (creates, reads) the account – IdP stands for Identity Provider.

If we’d want to show John Doe’s data in here, then Person = John Doe, and the two digital identity attributes would be Employee name = “John Doe” (inherent to the person), Employee number = “12345” (ascribed to the person by the Human Resource department). The Windows account would have attributes UPN = “John.Doe@companyX.com” (derived from the identity attribute Employee name) and objectGUID = “{9f95412b-95f7-4807-7f0a-f3138e5d35e1}” (set by the Identity provider).
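The same data can be written down as a small sketch in which each account attribute optionally records which identity attribute it realizes. The `AccountAttribute` class and its `realizes` field are assumptions made for illustration; the attribute names and values are the ones given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccountAttribute:
    name: str
    value: str
    # Name of the identity attribute this account attribute realizes;
    # None marks a purely technical attribute.
    realizes: Optional[str] = None


# Digital identity of John Doe in the context of company X.
digital_identity = {"Employee name": "John Doe", "Employee number": "12345"}

windows_account = [
    AccountAttribute("UPN", "John.Doe@companyX.com", realizes="Employee name"),
    AccountAttribute("objectGUID", "{9f95412b-95f7-4807-7f0a-f3138e5d35e1}"),
]

realized = {a.realizes for a in windows_account if a.realizes is not None}
technical = [a.name for a in windows_account if a.realizes is None]

# Every realization must point at an attribute that exists in the digital identity.
assert realized.issubset(digital_identity)
print(sorted(realized), technical)
```

Note that “Employee number” is part of the digital identity but is not realized by this particular account, which is consistent with the Account definition: the account only has to contain the identity attributes that are relevant within its information system context.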

Let’s do another one: the John-as-library-member example. John’s identity in the library’s system could be modelled as follows:

Person, Digital identity, Lender and Lender account have again been taken directly from Figure 4. In this model, the application service Lender webapp is the application making use of accounts, which are labelled Lender account.

If we’d want to show John Doe’s data in here, then again Person = John Doe, while the two digital identity attributes would be Name = “J. Doe” (inherent to the person), Member = “A123456” (ascribed to John when he became a member). The Lender account here has only a single identity attribute, Login = “A123456”.

In conclusion

In this post, six definitions have been developed: Subject, Digital identity, Identity attribute, Identity context, Interaction role, and Account. These definitions have been shown to be suitable for modelling (digital) identity, both in general ArchiMate models and in two worked examples. Topics that will have to be addressed in a different long read post are credentials and access rights.

References

Ben Ayed, G. (2014). Architecting User-Centric Privacy-as-a-Set-of-Services. Switzerland: Springer.

Grassi, P. A., Garcia, M. E., & Fenton, J. L. (2017). NIST SP800-63-3 Digital Identity Guidelines (Special Publication No. SP800-63-3). NIST. Retrieved from https://doi.org/10.6028/NIST.SP.800-63-3

McWaters, R. J. (2016, August). A Blueprint for Digital Identity. Consulted on November 28th, 2018, from http://www3.weforum.org/docs/WEF_A_Blueprint_for_Digital_Identity.pdf

Microsoft (2018). User Naming Attributes. Consulted on December 1st, 2018, from https://docs.microsoft.com/en-gb/windows/desktop/AD/naming-properties

Noonan, H., & Curtis, B. (2018). Identity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2018). Metaphysics Research Lab, Stanford University. Retrieved from https://plato.stanford.edu/archives/sum2018/entries/identity/

NORA (n.d.). Identiteitenbeheer. Consulted on November 28th, 2018, from https://www.noraonline.nl/wiki/Identiteitenbeheer

Open Group (2017). ArchiMate® 3.0.1 Specification. Zaltbommel: Van Haren.

Windley, P. J. (2008). Digital Identity. Sebastopol: O’Reilly Media.

Recognition

Special thanks to reviewers JB Sarrodie (chair for the Open Group ArchiMate forum) and Rob Post (architect at the Dutch National Office for Identity Data). Also thanks to reviewers Gerben Wierda (of Mastering ArchiMate fame) and Bob te Riele (also architect at the Dutch National Office for Identity Data).

6 thoughts on “Long read: Modelling Identity in Enterprise Architecture / ArchiMate”

Mark Paauwe says:

10th December 2018 at 13:18

Jan, great effort. I am not an expert in this field, so I may have some stupid questions for you; I just like to understand your effort. My questions are: why don’t you refer to RBAC and ABAC? Why should IDENTITY literally be modelled? Why don’t you refer to this paper?: https://ai2-s2-public.s3.amazonaws.com/figures/2017-08-08/9a5643d19a14b3e32ae2bdeaa7d859736b3454c5/3-Figure2-1.png https://www.semanticscholar.org/paper/Modeling-Access-Control-Transactions-in-Enterprise-Gaaloul-Guerreiro/9a5643d19a14b3e32ae2bdeaa7d859736b3454c5
In this presentation also RBAC is mentioned as IAM paradigm: https://www.slideshare.net/AlainHuet2/infosafe-ah-iam-2013-26270185

When I write scientific articles, I always start with: scholar.google.com
https://scholar.google.nl/scholar?hl=en&as_sdt=0%2C5&q=archimate+rbac&btnG=
1. Jan Schoonderbeek says:
  
  10th December 2018 at 20:17
  
  Mark, questions are never stupid. (Not asking questions may be.) I don’t refer to RBAC/ABAC because that’s access management, and the conclusion states I’ll address that in another paper (will be another long read). The Gaaloul/Guerreiro/Proper paper is already in my library, but it’s also access management – also it doesn’t cover identity, it directly ties the permissions to “users”. This overlooks the identity context. Thanks for the AlainHuet reference; I’ll investigate.
Pavel Janovjak says:

18th December 2018 at 13:33

Jan, I like this article very much, a perfect inspiration in the right moment.
Thanks!
Till Affolter says:

22nd January 2019 at 13:15

Jan, this is truly a very helpful description of the various concepts around (digital) identities! It really helped me to establish a common understanding of the concepts at my employer.
Looking forward to your blog posts about credentials and access.

Thanks for sharing this with us.
Till
Hans Bot says:

22nd March 2019 at 09:33

Jan, like you, I’ve experienced Identity and Entitlement Management as a particularly tricky topic to understand and model. If only all those different standards would align their terms and definitions… So thank you for clarifying the business terminology.
However, there is a subdomain I feel is missing in this discussion. It’s a concept I’ve learned to call ‘involvement role’. Actually, in many business processes, actors have authorizations based on an assigned involvement with a particular case – researcher, assessor, reviewer, operator, etc. A person can have different involvement with different cases (think: medicine, legal, projects). However, there are restrictions, the person must be qualified to be assigned an involvement role. Qualification may be derived from a business role, but often personal skills (e.g. language) and (valid) certifications also play a role.
Having a “person” identified by some identity “involved” in a “case” is easy to capture in Archimate. I tend to model the “Identity” object on the assignment relation between a “Subject” actor and the “Involvement Role” business interaction (a bit of a stretch – “a unit of behaviour performed as a collaboration between two” identities, so you could also opt for a business role, although this cannot be part of a business process), which is part of a “Case”, which is a specialized business process. Nowadays, you can even include the qualification in ArchiMate, and have a person realize this qualification. However, I struggle to find a way to express the condition here (only qualified person shall be assigned to involvement role). Any thoughts?
Thanks, Hans.
1. Jan Schoonderbeek says:
  
  22nd March 2019 at 22:06
  
  Hi Hans,
  There’s a lot to unpack in your question. Nothing like it has been covered yet, because it goes way further than identity only, into authorization and past that. Not something I can easily address fully in a succinct reply. That said, I could throw together these observations:
  I don’t think the business collaboration concept is a particularly good fit for what you want to model – and maybe neither is the role. Perhaps an expert is a business actor “subject” associated with a capability “skillset”. If I group these two, then the group “Expert” can serve the business process “Case”. There can be a requirement “Skill required” associated from “Case”, and it’s the capability “Skillset” that realizes it.
  Not really pretty yet, but ArchiMate primarily serves to model structure and behaviour of some definite (and thus unconditional) reality. Unsatisfactory I know, but hit me on LinkedIn and we can discuss this more in depth.
