UI professors discuss artificial intelligence, computer vision

By Matt Troher and Matt Novelli

The machines are learning.

As more of the contemporary world is automated for efficiency, computer software is increasingly capable of completing tasks once delegated to humans.

Artificial intelligence is the ability of a computer, or a robot controlled by a computer, to perform tasks that usually require human intelligence and discernment. Computer vision is a subset of AI in which computer systems use image-capturing software to understand and interpret the visual world. Computer science professors at the University of Illinois at Urbana-Champaign use AI and computer vision both inside and outside the classroom.

AI is becoming increasingly common in the contemporary technological landscape, but misconceptions about the field remain. Dr. David Forsyth, professor in Engineering, noted that despite the recent startup fervor surrounding AI and computer vision, most applications of these systems are practical rather than flashy.

“Everybody and their cat is out there starting up artificial intelligence companies, and mostly it boils down to a classifier,” Forsyth said. “Computer vision involves doing useful things with pictures … for example, airport surveillance. You take movies of people picking up bags where air traffic arrives, and then you use automated methods to follow the bags and make sure that the bag leaves customs with the same person who picked it up.”
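Forsyth’s point that most AI products “boil down to a classifier” can be illustrated with a minimal sketch. The nearest-centroid classifier below is not from any system mentioned in the article, and its two-number “features” are invented; it simply shows the basic shape of the technique: average the labeled examples for each class, then assign a new example to the closest average.

```python
# A minimal nearest-centroid classifier. All feature values are made up
# for illustration; real computer vision systems extract far richer
# features from images.

def centroid(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(labeled_examples):
    """labeled_examples: dict mapping label -> list of feature vectors."""
    return {label: centroid(vecs) for label, vecs in labeled_examples.items()}

def classify(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: dist2(model[label]))

# Toy "luggage" features, e.g. (color intensity, size):
model = train({
    "suitcase": [(0.9, 0.8), (0.8, 0.9)],
    "backpack": [(0.2, 0.3), (0.3, 0.2)],
})
print(classify(model, (0.85, 0.75)))  # suitcase
```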

Constructing an AI system can involve machine learning, a method of data analysis that automates analytical model building. To get a system started, however, the data must come from human input. Dr. Justin Leiby, professor in Business, noted that some machine learning algorithms rely on large pools of online workers to identify images, thereby training the algorithms. When an individual labels an image, the algorithm takes note and begins to make connections between the visual input and the individual’s classification.

“It’s a tool to teach the algorithms,” Leiby said. “This is sort of their training device. The human beings are able to (classify images), and you can start mimicking human classification if you have enough people doing it. So get a couple thousand people to answer a question, show them a picture of a chair and say, ‘What room of the house does this chair belong in?’ And suddenly, because thousands of people are telling an algorithm how to recognize a chair that goes in the living room versus a chair that goes in an office from its picture, the algorithm starts to learn to make its best guess.”
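The crowd-labeling process Leiby describes can be sketched in a few lines: collect many workers’ answers for one image and take the majority vote as the training label. The worker responses below are invented for illustration.

```python
# Turning crowd answers into a training label by majority vote.
# Responses are invented examples, not real worker data.
from collections import Counter

def aggregate_label(responses):
    """Majority vote over worker answers for one image."""
    return Counter(responses).most_common(1)[0][0]

responses_for_image = ["living room", "office", "living room",
                       "living room", "office"]
print(aggregate_label(responses_for_image))  # living room
```

With thousands of such aggregated labels, an algorithm can then be trained to mimic the majority human judgment.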

Dr. Derek Hoiem, professor in Engineering, has focused the majority of his career on studying how AI and computer vision can make the world a better place. Recently, Hoiem has taught courses such as CS 445: Computational Photography, CS 598: 3D Vision and CS 543: Computer Vision to undergraduates and graduate students alike.

As AI research develops, systems have become more sophisticated and adept at analyzing a wide range of data. Yet there is still room for improvement. One of Hoiem’s current research directions focuses on general purpose vision – a way to make an AI system capable of solving a wide range of tasks instead of focusing on a single specific task.

“So for example, like with people, we can see and we have hands and we move our bodies to take in information,” Hoiem said. “So we can do all kinds of tasks that conform to our senses and the actions we can take, and we can potentially learn to do the task and do it well. The same idea is applied to general purpose vision; the idea is that within that scope of what the AI system can see and what it can do, it can do any task that scope allows. So it can learn to detect objects, or it can learn to provide captions to images or classify images.”

Hoiem doesn’t just focus on AI and computer vision from an academic perspective. Hoiem also works as the chief strategy officer of Reconstruct – a company using computer vision to map construction sites and provide real-time updates to on-site subcontractors. Hoiem, along with colleagues Mani Golparvar and Tim Brettell, founded Reconstruct in 2015 to apply computer vision to solve a long-standing issue with construction: delays.

“This is a really important way that computer vision can benefit society,” Hoiem said. “Construction and built infrastructure undergirds our economy, and there’s a potential to address this critical need using the ability to create 3D models from photographs and do recognition. It’s well known that construction is often way behind schedule and over budget. Our idea was that we can use computer vision to provide that situation awareness for construction sites.”

At the forefront of the contemporary conversation around AI and computer vision is ethics. With the technology now established, researchers and activists are beginning to discuss the implications of how AI is created and how it is used.

In 2019, an AI project called Speech2Face – which used machine learning to generate facial images from speech recordings – came under criticism after sociologists and computer scientists raised concerns over the ethical implications of the project.

When utilizing machine learning, systems are only as good as the data they receive as input. Although the computer system itself may not be biased, the information it receives is still determined by humans who are susceptible to their own internal biases. One example is facial recognition software trained predominantly on images of light-skinned men, which then fails to recognize people with darker skin.

Hoiem noted that one main ethical problem stems from how the information developers input into systems can be biased, whether that bias be explicit or implicit.

“Ethically, where this really creates problems is that sometimes AI researchers and companies will create a product that will be developed based on some distribution of images or data,” Hoiem said. “For example, if it’s trying to identify faces, it might be based on employees of the company and other subjects that they were able to get data for, but that might not represent the broader demographic of the user base or the environment it will be applied to.”
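The dataset-bias problem Hoiem describes can be made concrete with a toy calculation: a system evaluated on data that over-represents one group can report high overall accuracy while performing poorly on an underrepresented group. All numbers below are invented for illustration.

```python
# A toy illustration of how overall accuracy can hide poor performance
# on an underrepresented group. All counts are invented.

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

# Imagine 90 test examples from group A (well represented in training)
# and only 10 from group B (underrepresented).
preds_a  = ["match"] * 87 + ["miss"] * 3   # 87 of 90 correct on group A
labels_a = ["match"] * 90
preds_b  = ["match"] * 5 + ["miss"] * 5    # 5 of 10 correct on group B
labels_b = ["match"] * 10

overall = accuracy(preds_a + preds_b, labels_a + labels_b)
print(f"overall: {overall:.0%}, group A: {accuracy(preds_a, labels_a):.0%}, "
      f"group B: {accuracy(preds_b, labels_b):.0%}")
# overall: 92%, group A: 97%, group B: 50%
```

The 92% headline number looks respectable even though the system fails half the time on group B, which is exactly why researchers urge reporting performance broken down by demographic group.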

[email protected]

[email protected]