A Real-World Character Sheet Based on First Impressions

by Nathan O'Donnell


Posted on April 20th, 2021 at 12:00 PM



Introduction

When you first meet a person, you make assumptions based on their appearance. These range from very basic characteristics, such as their age or gender, to more advanced guesses like their favourite colour, the car they might drive, or a friend of yours you could set them up with. We can make assumptions about almost anything. Just from one look at a person, we can imagine how they would act in certain situations or what their life has been like up to this point. That is what this project is all about: testing how strong our human intuition really is.

The aim of this project is to build a detailed character sheet when given a photo of a random person. To get the details, we survey many people and ask them for the attributes of the people we show them. From there, by employing machine learning algorithms, we can gauge a common answer for each image, compare it to other images that received the same answer, and find the similarities in those people’s features. For example, suppose a woman with red hair is shown and many people say her hair is “ginger”, and then another 100 people with red hair are shown and all receive the same answer. If we then show the computer a new person with red hair, it should say that the person has “ginger” hair.
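As a rough illustration of what “gauging a common answer” could mean at its very simplest, the sketch below takes a majority vote over the survey responses for one image. This is a stand-in for the idea, not the project’s machine learning pipeline; the function name common_answer and the sample responses are made up for the example.

```python
from collections import Counter


def common_answer(responses):
    """Return the most frequent answer and the share of respondents who gave it."""
    if not responses:
        raise ValueError("at least one response is required")
    # Normalise case and whitespace so "Ginger" and "ginger " count as one answer.
    counts = Counter(r.strip().lower() for r in responses)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(responses)


# Hypothetical survey responses for one photo (not real data).
hair_responses = ["ginger", "Ginger", "red", "ginger", "auburn"]
answer, agreement = common_answer(hair_responses)
print(answer, agreement)
```

The agreement share gives a crude measure of how consistent the judgements are, which is one of the things the project sets out to study.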

I got the chance to take on this project through a module at Queen’s University Belfast called “Computer Science Challenges”. The module gives first-year students the chance to try their hand at a final-year-style project. It’s been a great experience so far and a brilliant way to learn. The main aim of the module is to create something of lasting value. To help understand this project, I was given this description:

“People often make consistent superficial judgements of a strangers’ personality and life from their appearance. While these judgements can often be very inaccurate, the fact that many people can feel a similar way is interesting and can be used to reveal cultural bias and measure the limits of human intuition. The goal of this project is to create a realistic ‘character sheet’ as a JSON data structure that defines a person and their appearance in an image. By getting crowd workers to label fictional and real people we can analyse how people make judgements about others and how accurate and consistent such judgements can be.”

WikiData

WikiData is where I started this project. It is one of the largest collections of formalised data about people, places, things, etc. The site provides great sample data. The data I used came from a collection of fictional characters’ information, which can be found here. This collection proved useful when testing the JSON editor. If you are unfamiliar with the JSON structure, I would recommend looking at the example below and getting to grips with the format:

{
  "entities": {
    "Q96282709": {
      "modified": "2020-06-14T21:54:24Z",
      "id": "Q96282709",
      "labels": { "en": { "language": "en", "value": "Abed Nadir" } },
      "descriptions": {
        "en": {
          "language": "en",
          "value": "fictional character on the NBC/Yahoo! Screen sitcom series Community"
        }
      }
    }
  }
}
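For readers new to JSON in Python, here is a small sketch of reading the English label out of an entity like the one above with the standard json library. The sample string is a trimmed copy of the excerpt, and the helper name english_label is hypothetical.

```python
import json

# A trimmed, well-formed copy of the WikiData excerpt.
sample = """
{
  "entities": {
    "Q96282709": {
      "id": "Q96282709",
      "labels": {"en": {"language": "en", "value": "Abed Nadir"}},
      "descriptions": {
        "en": {
          "language": "en",
          "value": "fictional character on the NBC/Yahoo! Screen sitcom series Community"
        }
      }
    }
  }
}
"""


def english_label(entity):
    """Return the English label of an entity dict, or None if it has none."""
    return entity.get("labels", {}).get("en", {}).get("value")


data = json.loads(sample)
for entity_id, entity in data["entities"].items():
    print(entity_id, "->", english_label(entity))
```

Chaining .get() with empty-dictionary defaults means an entity without an English label yields None instead of raising a KeyError.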

Flask

The character sheet is displayed as an HTML form, which needs a web server to run on. For this I used Flask, a micro web framework written in Python. A web framework makes developing web applications easier, and the “micro” part refers to the framework’s deliberately minimal core. Flask is easy to learn just by reading the user guide, and it provides debugging features that help tweak parts of the web application without hassle. To install Flask, activate your virtual environment and run the following at the command prompt:

pip install Flask

When Flask is running, the JSON Editor can be displayed and modified however you like. The files for the JSON Editor can be found here. Flask offers many features, such as access to submitted form data through the “request” object. To use it in your project, import request from the flask package. Some uses of request are:

  • request.method, which gives the HTTP method of the current request (for example GET or POST)
  • request.form, which holds the submitted form data
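To make these two attributes concrete, here is a minimal sketch of a Flask view that shows a form on GET and reads the submitted field on POST. The route, the field name “name” and the function name index are assumptions for illustration and are not taken from the project.

```python
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)


@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        # request.form behaves like a dictionary of submitted field names and values.
        name = request.form.get("name", "")
        # Escape user input before putting it back into the response.
        return f"Received: {escape(name)}"
    # On a plain GET, show a tiny form that posts back to the same URL.
    return (
        '<form method="post">'
        '<input type="text" name="name">'
        '<input type="submit" value="Submit">'
        "</form>"
    )
```

Run under Flask’s development server, visiting the page shows the form, and submitting it sends a POST request back to the same view.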

Character Sheet Editor

To ease editing of the character sheet, I created an Editor web application which navigates through a JSON file, detects the type/instance of each value, and creates HTML form elements based on that type. It is written in Python, as the “json” library allows you to read JSON file data and edit it. If no specific type, such as date/image/table, is specified, the program gets the base instance of the value (e.g., string, int, etc.) and converts it to an HTML form element based on that. As of right now, the Character Sheet Editor is not extremely pretty; however, it functions well and allows us to move forward.

Form Creation

To start form creation, the JSON file must be “loaded” into the program. This is done by using the load() function of the json library, which turns the file into a dictionary. The dictionary is then parsed by a recursive method which iterates through it.
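The load-then-recurse idea can be sketched as follows. This is not the editor’s code: the function name walk and the sample data are hypothetical, and the "/"-separated path format is an assumption based on the path.split('/') call used later when parsing feedback.

```python
import io
import json


def walk(node, path="dict"):
    """Yield (path, value) for every leaf value in a nested JSON structure."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + "/" + str(key))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + "/" + str(index))
    else:
        # Strings, numbers, booleans and None are leaves.
        yield path, node


# io.StringIO stands in for an opened file such as Person.json.
fake_file = io.StringIO('{"visible": {"age": 30, "hair": ["red", "curly"]}, "alive": true}')
data = json.load(fake_file)

for leaf_path, value in walk(data):
    print(leaf_path, "=", value)
```

Each leaf ends up with a unique path, which is what makes it possible to give every generated form element a unique ID.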

First, the program checks if a type is specified for the value. If it is, the type is compared to the types the program can currently handle: Text, Number, Boolean, Date, and Image. If the type matches any of these, the appropriate method is called.

Otherwise, if a type is not specified, the instance of the current dictionary element is checked. The program can handle all possible instances of the dictionary element – dict, list, str, int, complex, float, bool, and “None” values. Whichever instance the element satisfies, the appropriate method is called, and the dictionary element is converted to an HTML form element.

def json_to_elements(js, html_form, path, label):
    # Dispatch on the Python type that json.load() produced for this value.
    if isinstance(js, dict):
        return dict_to_element(js, html_form, path, label)
    elif isinstance(js, list):
        return list_to_element(js, html_form, path, label)
    elif js is None:
        return none_to_element(js, html_form, path, label)
    elif isinstance(js, bool):
        # bool must be tested before int, because bool is a subclass of int.
        return bool_to_element(js, html_form, path, label)
    elif isinstance(js, str):
        return str_to_element(js, html_form, path, label)
    elif isinstance(js, (int, float, complex)):
        return number_to_element(js, html_form, path, label)
    else:
        return -1  # unsupported type

As we may be dealing with large amounts of data, it is important to have unique names for HTML elements. To accommodate this, a naming convention was employed where the ID of the element is the path to the value in the dictionary, e.g.:

path = dict["parent"][0]
ID = dict_parent_0
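The helpers path_to_id and id_to_path are used in the code in this post but are not shown. One possible implementation is sketched below, assuming paths are stored as "/"-separated strings (as the later path.split('/') call suggests). The function bodies are an assumption, not the project’s actual code.

```python
def path_to_id(path):
    """Turn a path such as 'dict/parent/0' into an element ID such as 'dict_parent_0'."""
    return path.replace("/", "_")


def id_to_path(element_id):
    """Reverse path_to_id. Only safe when no key contains an underscore."""
    return element_id.replace("_", "/")


element_id = path_to_id("dict/parent/0")
print(element_id)
print(id_to_path(element_id))
```

Note the limitation of this simple scheme: a key that itself contains an underscore, such as eye_colour, would be split into two path parts on the way back, so a more robust version would need to escape or forbid underscores in keys.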

def str_to_element(js, html_form, path, label):
    elementId = path_to_id(path)

    # adding html code
    html_form.write("\n<div id='"+elementId+"'>")
    if isinstance(js, dict):
        # typed value: the text is stored under js['value']
        html_form.write('\n<br><label for="' + elementId +
                        '_value">' + label.capitalize() + ":</label>")
        if len(js['value']) > 15:
            # long strings get a multi-line textarea
            html_form.write('\n<textarea style=\"margin-left:15px;\" id="' + elementId +
                            '_value" name="' + elementId +
                            '_value" rows = "3" cols = "40">' + js['value'] + "</textarea><br>")
        else:
            html_form.write('\n<input style=\"margin-left:15px;\" type="text" id="' + elementId +
                            '_value" name="' + elementId +
                            '_value" value="' + js['value'] + '"><br>')
        # hidden field carries the declared type back with the form
        html_form.write('\n<input type="hidden" id="' + elementId +
                        '_type" name="' + elementId +
                        '_type" value="Text">')
    else:
        # plain string with no declared type
        html_form.write('\n<br><label for="' + elementId +
                        '">' + label.capitalize() + ":</label>")
        if len(js) > 15:
            html_form.write('\n<textarea style=\"margin-left:15px;\" id="' + elementId +
                            '" name="' + elementId +
                            '" rows = "3" cols = "40">' + js + "</textarea><br>")
        else:
            html_form.write('\n<input style=\"margin-left:15px;\" type="text" id="' + elementId +
                            '" name="' + elementId +
                            '" value="' + js + '"><br>')
    html_form.write("\n</div>")

Once the form was created, it needed to be displayed. This was done through Flask because it makes getting form feedback very easy.

Getting Feedback

To get feedback from the form, we use the “request” object from Flask, which gives access to the form fields once the user hits submit. The “methods” parameter of the route must also be set to ['GET', 'POST'], so that the page can be requested with GET and the form submitted with POST. With this in place, “request.form” is passed as a parameter to the parse_form method of the “formalise” Python program.

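from flask import Flask, request

import formalise  # the project's form-building program

app = Flask(__name__)


@app.route("/", methods=['GET', 'POST'])
def start():
    # Forward slashes avoid the invalid "\P" escape and work on every OS.
    formalise.create_form("Formalisations/Person/Person.json")
    with open("Website/index.html", "r", encoding="utf-8") as html_form:
        out = html_form.read()
    if request.method == 'POST':
        formalise.parse_form(request.form)
    return out

Parsing Feedback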

Since request.form has been passed in as a parameter, the parse_form method can now turn it back into JSON. request.form is an ImmutableMultiDict, a dictionary type from Werkzeug (the library Flask is built on) that allows multiple values to be mapped to a single key. As it is a type of dictionary, each key can be retrieved. In the parse method, a new empty dictionary is created to hold the JSON data. The method then iterates through each key in request.form, getting the ID, converting the ID to a path, and splitting the path into “pathParts”.

for key in feedback:
    value = feedback[key]
    path = id_to_path(key)
    pathParts = path.split('/')  # create path parts list

After this, a parent variable is created and set to the empty dictionary made earlier. This variable always points at the direct parent of the element currently being parsed. From here, the method iterates through each “part” in “pathParts”, checking whether it is the first part, the last part, or a part that is already in the parent dictionary.

parent = jsonDict
for ind, part in enumerate(pathParts):  # iterate through path parts
    if ind == 0:
        continue
    elif ind == len(pathParts) - 1:
        parent[part] = str(value)
    elif part in parent:
        parent = parent[part]
    else:
        parent[part] = {}
        parent = parent[part]
jsonDict = convert_type(jsonDict)

  • First part: the top-level dictionary, so it is skipped
  • Last part: the element we are currently parsing, so its value is stored
  • Part already in the dictionary: the parent variable is set to this part

If the part is the first part, the method continues as it is the top-level dictionary. If it is the last part, the value is set to the key's value (“feedback[key]”). If the part is already in the parent dictionary, the parent is set to this new part. In any other case, the part is set to an empty dictionary and the parent is set to this part. All that is needed after this is to use the dump() function of the json library, passing the dictionary containing the updated JSON data and the file it is written to.

import json

# open a new JSON file and write the dictionary to it;
# the with-block closes the file so the data is flushed to disk
with open('feedback.json', "w") as jfile:
    json.dump(jsonDict, jfile)
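Putting the parsing steps together, the following self-contained sketch rebuilds a nested dictionary from flat form fields. It is a simplified illustration rather than the project’s parse_form: it assumes IDs are path parts joined by underscores, treats list indices as ordinary keys, and leaves out the convert_type step. The name form_to_dict and the sample fields are hypothetical.

```python
import json


def form_to_dict(feedback):
    """Rebuild a nested dictionary from flat form fields keyed by element ID."""
    result = {}
    for key, value in feedback.items():
        # Assumed ID format: path parts joined by "_", first part is the root.
        parts = key.split("_")[1:]
        if not parts:
            continue  # the key names only the root, so there is nothing to set
        parent = result
        for part in parts[:-1]:
            # Create intermediate dictionaries on demand, then descend into them.
            parent = parent.setdefault(part, {})
        parent[parts[-1]] = str(value)
    return result


# A plain dict stands in for request.form here.
feedback = {
    "dict_visible_hair_value": "ginger",
    "dict_visible_hair_type": "Text",
    "dict_invisible_career": "film student",
}
print(json.dumps(form_to_dict(feedback), indent=2))
```

Using setdefault combines the “already in the parent” and “create an empty dictionary” cases into a single step.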

Formalisations of Things

In the GitHub Repository for this project, there is a “Sample Data” directory. In this directory, there are subfolders – Clothing, Career, and Person.

Clothing contains JSON files of different articles of clothing, with each file having the variations of that article of clothing, e.g., Shirt.json will have polo shirt, rugby top, crop top, etc.

Career contains OStarNet.json, a file of about 1000 job titles with their associated O*NET-SOC codes and descriptions.

Then we have the Person folder, which contains the blank Person.json, which is the template for the character sheet so far, along with 13 instances of the template using fictional character data. All of these can be extended further and improved upon, but this is just a start for formalising a person.

The Person.json file is separated into two sub-dicts – “visible”, concerning visible traits of a person, e.g., hairstyle, eye colour, age, etc. and “invisible”, concerning the invisible traits, such as personality, net worth, romantic life, etc. This is where the most work can be done, adding new traits and ways of formalising a person.
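To give a feel for this layout, here is a heavily shortened, hypothetical version of such a template as a Python dictionary. The trait names come from the examples above, but the exact keys and the value/type layout are assumptions and may differ from the real Person.json.

```python
import json

# Hypothetical, heavily shortened template: the real Person.json has more
# traits, and its exact key names may differ from the ones used here.
person = {
    "visible": {
        "hairstyle": {"value": "", "type": "Text"},
        "eye colour": {"value": "", "type": "Text"},
        "age": {"value": None, "type": "Number"},
    },
    "invisible": {
        "personality": {"value": "", "type": "Text"},
        "net worth": {"value": None, "type": "Number"},
    },
}


def count_traits(sheet):
    """Count the traits listed under each top-level group of the sheet."""
    return {group: len(traits) for group, traits in sheet.items()}


print(json.dumps(person, indent=2))
print(count_traits(person))
```

Adding a new trait then amounts to adding one more key under “visible” or “invisible”.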

StyleGAN2-ada

Another student, Dean Mulholland, is also working on this project, with a focus on StyleGAN2-ada. So far he has been able to use StyleGAN2-ada to generate images from a dataset of examples and a pretrained network. The technology is very interesting, and you can find more about it here. The end goal is to use it to analyse an image of a person and return the character sheet with all values filled in. The hope is that it will be able to look at someone and return accurate descriptions of their personality.

StyleGAN Sample Image

Future Plans

Surveying and Machine Learning

To gather data for recognising characteristics, we will use Amazon Mechanical Turk, a crowdsourcing website where human workers perform tasks that computers cannot. To gather reliable data we will need quite a lot of it, covering questions such as how participants rate a person’s personality traits, whether they think the person is more introverted than extroverted, what career industry they might work in, and what car they might drive. The survey will take the form of a web application where, given an image of a person, the user can speculate about the values on this person’s character sheet.

Once we have enough responses to make an educated guess at a person’s characteristics from their face, and we have this for enough images of people, we can use the StyleGAN2-ada code to convert each image into a form on which we can build machine learning models. These models would predict how people infer characteristics about a person from their appearance, for example the connection between what people assume about someone’s personality and the industry they assume that person works in. We can also work with derived values, that is, values obtained from primary values such as hairstyle and hair colour, and look for links between those and, say, a person’s music taste. With the number of ways we can describe and formalise a person, it will be very interesting to see the formative factors that go into shaping a person’s life and how small things may affect their lives in huge ways.
