Association

Association decisions are about marking relationships between entities. A hyperlink is a very visible form of association between webpages. On Wikipedia, for instance, algorithms exist that automatically create hyperlinks between pages that share some relationship. A related algorithmic decision involves grouping entities into clusters, in a sort of association en masse. Associations can also be prioritized, leading to a composite decision known as relevance. A search engine prioritizes the association of a set of webpages in response to a query that a user enters, outputting a ranked list of relevant pages to view.

Association decisions draw their power through both semantics and connotative ability. Suppose you’re doing an investigation of doctors known to submit fraudulent insurance claims. Several doctors in your dataset have associations to known fraudsters (e.g., perhaps they worked together at some point in the past). This might suggest further scrutinizing those associated doctors, even if there’s no additional evidence to suggest they have actually done something wrong.

IBM sells a product called InfoSphere Identity Insight, which is used by various governmental social service management agencies to reduce fraud and help make decisions about resource allocation. The system is particularly good at entity analytics, building up context around people (entities) and then figuring out how they’re associated. One of the IBM white papers for the product points out a use case that highlights the power of associative algorithms.15 The scenario depicted is one in which a potential foster parent, Johnson Smith, is being evaluated. InfoSphere is able to associate him, through a shared address and phone number, with his brother, a convicted felon. The paper then renders judgment: “Based on this investigation, approving Johnson Smith as a foster parent is not recommended.” In this scenario, the social worker would deny a person the chance to be a foster parent because he or she has a felon in the family. Is that right? In this case, because the algorithm made the decision to associate the two entities, that association suggested a particular decision to the social worker.

Association algorithms are also built on criteria that define the association. An important metric fed into many of these algorithms is a similarity function, which defines how closely two things match according to the given association. When the similarity reaches a particular threshold value, the two things are said to have that association. Because applying a threshold in this way is a form of classification, association decisions can suffer the same kinds of false positive and false negative mistakes.
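To make the role of the threshold concrete, here is a minimal sketch in Python. It assumes each entity is represented as a set of attribute values and uses Jaccard similarity (shared attributes divided by total distinct attributes) as the similarity function. Both choices are illustrative assumptions; the function names, the records, and the threshold values are hypothetical and do not describe how any particular product works.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """Fraction of distinct attribute values that the two entities share (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def are_associated(a: set, b: set, threshold: float = 0.5) -> bool:
    """Declare an association once the similarity reaches the threshold."""
    return jaccard_similarity(a, b) >= threshold


# Hypothetical records, invented for illustration only.
applicant = {"12 Elm St", "555-0100", "Smith"}
relative = {"12 Elm St", "555-0100", "Smith", "prior conviction"}
stranger = {"12 Elm St", "555-0199", "Jones"}  # e.g., a former tenant at the same address

print(are_associated(applicant, relative))                  # shares 3 of 4 distinct values
print(are_associated(applicant, stranger))                  # shares 1 of 5 distinct values
print(are_associated(applicant, stranger, threshold=0.15))  # same pair, looser threshold
```

With the default threshold, only the first pair is associated. Lowering the threshold also links the applicant to the unrelated former tenant, which is a false positive; raising it far enough would eventually miss the genuine link, which is a false negative. Where the threshold is set is therefore itself a consequential decision.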
