Relationships

Generating the shape of your data's network is more important than generating the shape of each record.

People have friends. Services have customers. Organizations have members. In short, records are related to each other. In fact, GraphGen actually generates the complete network of relationships between records before it generates the values for the records themselves.

Like "normal" fields, Relationships are declared in GraphGen are declared using natural GraphQL syntax. The following illustrates a relationship between people and the businesses that employ them (assuming a 5% unemployment rate):

type Person {
  name: String!
  employer: Business @has(chance: .95)
}

type Business {
  name: String!
}

Now, when we create a Person, GraphGen will also create an Employer for them automatically.

let person = graphgen.create("Person");
person.employer.name; //=> Acme Corp, LLC

One to Many

While the Person type in our previous example had a single Employer, we can also approach the relationhsip from the opposit side. Instead of A person having an employer, we could also say that an employer has many employees. We express this using GraphQL list syntax

type Business {
  employees: [Person]
}

type Person {
  name: String!
}

💡Non-null modifiers do not have any effect on list fields.

Now when you create a Business, it will have a related list of employees:

let business = graphgen.create("Business");
business.employees.length; //=> 5
business.employees[0].name; //=> Bob Martin

Controlling the size of relationships

A business with one thousand employees is common-place, but a person with one thousand children would definitely raise some eyebrows. There's no one-size-fits-all, which is why GraphGen allows you to customize how many related records to generate. To do this, it uses a probabilistic approach. We can use the @size() directive to say that "the average business" has at least 1000 employees, and then graphgen will select a number of employees that feels right given those constraints.

type Busines {
  employees: [Person] @size(mean: 1000, min: 1)
}

💡the @size() directive can only be applied to lists of related types.

Given this information, grahgen will generate for you businesses with 900 employees. Others will have 1200, and still others may have as few as five, but the majority will be centered around the parameters you set.

For a full description, see the reference for the @size directive.

Referential Integrity

If you're using graphgen to serve data from a simulated service, then it's not enough that a Business has a list of Employees, and that a Person has an employer. Those records should point to each other and be the same actual data structure in memory. GraphGen lets you do this with the @inverse directive. Here's the previous example that "links up" the Business and Employee records:

type Person {
  employer: Business @has(chance: .95)
}

type Business {
  employees: [Person] @inverse(of: "Person.employer") @size(mean: 1000, min: 1)
}

When GraphGen creates a Person with an employer, that person will automatically be listed inside its employees. By the same token, when a business is created, each of its employees will have a back-reference to it.

let person = graphgen.create("Person");
person.employer.employees.includes(person) //=> true

let business = graphgen.create("Business");
for (let employee of busines.employees) {
  assert(employee.employer === business);
}

Coping with Cyclic Graphs

One problem that immediately arises the moment you have a self-referential type is the problem of cyclic graphs. For example, suppose we want to model the data inside of a social network. As we know, the entire point is to relate people to people. We might want to model it like this:

type Person {
  friends: [Person]
}

However this also presents a problem. If we create a Person, which creates 10 friends who are also people so they in turn create 10 more friends, the graph quickly explodes into the infinite. In reality, we can have massively interconnected networks of people because the networks turn back in on themselves and multiple people will be friends with the same person, so if we trace all of the relationships, it make take awhile, but they will ultimately be finite.

GraphGen simulates this with the concept of "affinity." The affinity of a relationship as it pertains to GraphGen is the probability that any two nodes in a population are connected by that relationship.

When generating a graph, GraphGen will use the affinity of a relationship to decide how ofter to reuse existing records, or to create a new one from scratch. In our previous example of a social network, let's say that the average user has about 10 frientds, and that the probability that any two users are connected as friends is 10%. We would represent this like so:

type Person {
  friends: [Person] @affinity(of: 0.1) @size(mean: 10)
}

This means that if we create two people, the chance is 1 in 10 that they will be friends, but as we create more and more people, the likelihood that some of them are connected to an existing person grows from a relative unlikelihood into a near mathematical certainty. As this happens, GraphGen will end begin re-using records with increasing frequency until the graph of records converges.