CESAR · June 6, 2023
From “Hello World” to an Array of A.I.-powered Design
One of the early success stories of “generative A.I. meets human creativity” is “Hello World,” the world’s first album composed by an artist, SKYGGE, with artificial intelligence. It began as a Sony research effort (the Flow Machines project) to augment human creativity with algorithms that capture and reproduce the concept of musical style and generate new, compelling musical material of all sorts, from melodies and harmonies to timbre and rhythms. Along the way, under the artistic direction of SKYGGE, the artists took control, and the scientific research became a music project.
In 2016, when CESAR’s Chairman of the Board Giordano Cabral was working with his former PhD advisors François Pachet (of Sony Computer Science Lab Paris) and Jean-Pierre Briot (of Sorbonne University), both pioneers in generative A.I. research, he witnessed the preparation of the release of “Daddy’s Car,” the first pop song ever co-written by an A.I. system, FlowMachines. The song is a “catchy, sunny tune reminiscent of The Beatles” and it was “pretty good, actually,” according to a Quartz news story about its release.
Giordano later joined the project, which led to the first-ever A.I.-assisted music album, “Hello World.”
The world’s first A.I.-assisted song, “Daddy’s Car” – composed by a musician working with a generative machine – was released in 2016. It was reminiscent of The Beatles’ sound.
After analyzing a database of songs, FlowMachines identified a particular musical style and created new compositions in that style – much as today’s smart internet music platforms, from Apple to Pandora to Spotify, analyze listening habits to find and deliver music that pleases various listeners, except that the Sony lab’s A.I. went further and generated similar compositions itself. For both of the initial singles released in 2016 – including “Mr Shadow,” a dreamy ditty created in the style of American songwriters Irving Berlin, Duke Ellington, George Gershwin, and Cole Porter – French composer Benoît Carré arranged the songs and wrote the lyrics, so it was a combination of man and machine that produced them.
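To give a rough sense of what “learning a style and generating similar compositions” can mean in practice, here is a minimal sketch of a first-order Markov-chain melody generator in Python. It only illustrates the basic idea – learning note-to-note transition statistics from a corpus and sampling new material with a similar feel – and is not a description of Flow Machines itself; the toy corpus and note names are invented for the example.

```python
import random
from collections import defaultdict

def learn_transitions(melodies):
    """Count which notes tend to follow which in a corpus of melodies."""
    transitions = defaultdict(list)
    for melody in melodies:
        for current_note, next_note in zip(melody, melody[1:]):
            transitions[current_note].append(next_note)
    return transitions

def generate(transitions, start_note, length=16, seed=None):
    """Sample a new melody by walking the learned transition table."""
    rng = random.Random(seed)
    melody = [start_note]
    for _ in range(length - 1):
        choices = transitions.get(melody[-1])
        if not choices:  # dead end: fall back to the opening note's options
            choices = transitions[start_note]
        melody.append(rng.choice(choices))
    return melody

# Toy corpus: melodies as lists of note names (a real system would use MIDI data).
corpus = [
    ["C4", "E4", "G4", "E4", "C4", "D4", "E4", "C4"],
    ["C4", "D4", "E4", "G4", "E4", "D4", "C4", "C4"],
]
table = learn_transitions(corpus)
print(generate(table, "C4", length=12, seed=42))
```

The actual Flow Machines research went far beyond a sketch like this, for example by constraining such statistical models so that the generated material respects larger-scale musical structure rather than just local note-to-note habits.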
The dizzying pace of change in just seven years
“Music was an interesting place to start with generative A.I. because it lies on the mid-path between subjectivity and objectivity. Computers tend to be really good at objective things – but the classic human characteristic of creativity is mostly a subjective thing,” said Cabral during a recent night out in Paris with his former Sony lab advisors, two of the world’s rock stars in the domain of music co-creation with A.I. “For me, it was a fantastic kick-off point for this experimentation in a world-class lab.”
CESAR’s Board Advisor Giordano Cabral was a PhD student at Sony’s CSL research lab in Paris, where he helped with the development of the world’s first A.I.-assisted album, “Hello World.”
“After early pilots with music and A.I., we began to see more rapid advances in technologies such as ‘style transfer’ – wherein you could upload a photo, for example, and say to the machine: ‘Render it so it looks like Van Gogh painted it.’ Then came prompt-based technology, which lets nearly anyone create nearly anything they want if they can describe it in a way the machine understands – like: ‘Create me an image of the current Pope of the Catholic Church riding on a skateboard through a small crowd of people near the Vatican in Rome, Italy, as the sun begins to dip below the horizon.’”
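As an illustration of how little code that prompt-based workflow now requires, here is a minimal sketch using the OpenAI Python client to request an image from a DALL-E model. The client library, model name, and parameters are assumptions based on the publicly documented v1-style `openai` SDK and may differ by version; the prompt simply reuses the example from the quote above.

```python
# Minimal prompt-based image generation sketch, assuming the `openai` package
# (v1-style client) and access to a DALL-E model; names and parameters may vary.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.images.generate(
    model="dall-e-3",
    prompt=(
        "The current Pope of the Catholic Church riding a skateboard through "
        "a small crowd of people near the Vatican in Rome, Italy, as the sun "
        "begins to dip below the horizon"
    ),
    size="1024x1024",
    n=1,
)

# The service returns a URL (or base64 payload) for each generated image.
print(response.data[0].url)
```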
While artists have experimented with cruder computer-generated imagery since the 1960s, the field of “generative art” has exploded during the last decade. For some artists, graphical user interfaces and computer code have become an art form in their own right. For example, British-born Adrian Ward is a software artist and musician “known for his generative art products released through his company Signwave, and as one third of the techno gabba ambient group, Slub.” Today, numerous artists around the globe use A.I. to generate everything from abstract paintings, screenplays, and comedy skits to music, poetry, and book translations.
Today’s new crop of generative A.I. applications such as DALL-E and Midjourney are expected to disrupt creative work that has traditionally been within the realm of humans – and potentially upend a “creator economy” that’s currently valued at $14 billion per year globally, according to a recent article published by the Harvard Business Review.
The HBR article’s authors present three potential outcomes:
1) There will be an explosion of A.I.-assisted innovation, “where machine-augmented human creativity will enable mainly rapid iteration.”
2) Machines will monopolize creativity and a dystopian reality will emerge where fewer humans make less art and content, and only a “handful of established artists dominate the market with a long tail of creators retaining minimal market share.”
3) “Human-made commands a premium,” and real people will maintain a competitive edge over algorithmic competition. This scenario will require political leaders to act “to strengthen the governance of information spaces” and to deal with downside risks – including being overwhelmed with false or misleading content, which will demand more human-powered oversight and governance to address.
Rapid advancements need societal oversight
Every time there’s been a new technological era – from the Industrial Revolution of 1760 to 1840, which shifted the making of goods from hand to machine, to the modern Electronic Age that has spawned personal computing, the internet, social media, cloud computing, and the IoT – society has needed to come together as a whole to address what must evolve along with those changes.
“It’s not only the responsibility of the tech industry to provide oversight and governance,” said Cabral. “Some earlier criticisms of A.I., such as ethnic bias, are already on the path to being addressed. But we, as a society, need to be vigilant, because there are often inherent risks with technology’s evolution, from protecting children on the internet and social platforms to preventing fraud and cybercrime in e-commerce.
“When Microsoft released new chat technology that learned from databases of social networks, they ran into issues with hate speech. That new A.I. was disabled in a matter of hours,” continued Cabral. “It is sometimes difficult to predict or avoid issues before they arrive because many of them are surprises, even for developers.”
The broad societal benefits may be worth it
“One of the things I firmly believe when it comes to balancing the risks vs. the benefits of generative A.I. for mankind is the concept of co-creation between people and machines,” said Cabral. “Co-creation means that machines will not substitute for people. Instead, they will act as intelligent partners that design new masterpieces together with us. For a machine, reproducing or augmenting something from its original form is simpler than creating something from scratch, because it has existed in a similar form in the world before. This new breed of smart machines connected to advanced A.I. can learn how to rapidly detect, analyze, and classify things. The combined creative intelligence of humans paired with machines can up-level people’s artistic skills using human-powered A.I.”
The concern over creators being replaced by GenA.I. tools has been heating up in recent months. One of the key issues in the Writers Guild of America strike that commenced on May 2 is TV, news, radio, and online writers’ fears of “being replaced by A.I., having their work trained by A.I., or being hired to punch up A.I.-generated scripts at a fraction of their former pay rates,” according to an analysis and critique by The New York Times’ James Poniewozik titled “TV’s War with Robots Is Already Here.”
TV Critic Poniewozik writes: “In the perceptive words of ‘Mrs. Davis,’ the wildly human comedic thriller about an all-powerful A.I.: ‘Algorithms love cliches.’ There’s a direct line between the unoriginality of the business – things TV critics complain about, like reboots and intellectual-property adaptations and plain old derivative stories – and the ease with which entertainment could become bloated by machine-generated mediocrity.”
On the other hand, various forms of A.I. could be TV writers’ new best friends by, for instance, helping them come up with plot twists and endings designed to please viewers – in part, by capturing and analyzing their comments about shows on social media.
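How might that look in practice? A hypothetical sketch: a writers’ room could run viewer comments through an off-the-shelf sentiment classifier to see which plot threads land and which fall flat. The example below uses the Hugging Face `transformers` sentiment-analysis pipeline with invented comments; it is an illustration, not a description of any studio’s actual workflow.

```python
# Hypothetical sketch: gauge audience reaction to a show with an off-the-shelf
# sentiment classifier (Hugging Face `transformers`); the comments are invented.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default English model

viewer_comments = [
    "That finale twist completely ruined the season for me.",
    "I can't stop thinking about the last episode - what a payoff!",
    "The villain's arc felt rushed and predictable.",
]

for comment, verdict in zip(viewer_comments, classifier(viewer_comments)):
    print(f"{verdict['label']:>8} ({verdict['score']:.2f})  {comment}")
```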
There are clearly societal benefits when it comes to GenA.I. For example, some of the research projects CESAR is now working on may open up new possibilities for people who are physically challenged because they are blind or deaf. Thanks to large language models and related multimodal systems, we can translate between imagery and language and vice versa – so it’s much faster today to transcribe a movie’s dialogue into accurate subtitles that can be read by those who are hearing impaired, or to provide additional context about what’s happening on screen for those who are visually impaired.
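For a concrete sense of how approachable these accessibility building blocks have become, here is a minimal sketch using the Hugging Face `transformers` library: a Whisper model to turn dialogue into timed subtitle text, and a BLIP captioning model to describe a video frame. The model names, file names, and pipeline options are assumptions for illustration and may differ by library version; production subtitling and audio description involve far more than this.

```python
# Minimal accessibility sketch with Hugging Face `transformers`; model names,
# options, and the input files ("movie_clip.wav", "frame_00123.png") are
# assumptions for illustration only.
from transformers import pipeline

# Speech-to-text for viewers who are hearing impaired: timed subtitle chunks.
transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-small")
result = transcriber("movie_clip.wav", return_timestamps=True)
for chunk in result["chunks"]:
    start, end = chunk["timestamp"]  # (start_seconds, end_seconds)
    print(f"[{start} - {end}] {chunk['text'].strip()}")

# Image-to-text for viewers who are visually impaired: describe what is on screen.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
print(captioner("frame_00123.png")[0]["generated_text"])
```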