
Article

Cite This: ACS Nano 2019, 13, 7471−7482 www.acsnano.org

A Self-Consistent Sonification Method to Translate Amino Acid Sequences into Musical Compositions and Application in Protein Design Using Artificial Intelligence
Chi-Hua Yu, Zhao Qin, Francisco J. Martin-Martinez, and Markus J. Buehler*

Laboratory for Atomistic and Molecular Mechanics (LAMM), Department of Civil and Environmental Engineering, Massachusetts
Institute of Technology, 77 Massachusetts Avenue 1-290, Cambridge, Massachusetts 02139, United States
*S Supporting Information

Received: March 20, 2019
Accepted: June 5, 2019
Published: June 26, 2019

ABSTRACT: We report a self-consistent method to translate amino acid sequences into audible sound, use the representation in the musical space to train a neural network, and then
apply it to generate protein designs using artificial intelligence
(AI). The sonification method proposed here uses the normal
mode vibrations of the amino acid building blocks of proteins
to compute an audible representation of each of the 20 natural
amino acids, which is fully defined by the overlay of its
respective natural vibrations. The vibrational frequencies are
transposed to the audible spectrum following the musical
concept of transpositional equivalence, playing or writing
music in a way that makes it sound higher or lower in pitch
while retaining the relationships between tones or chords
played. This transposition method ensures that the relative
values of the vibrational frequencies within each amino acid and among different amino acids are retained. The
characteristic frequency spectrum and sound associated with each of the amino acids represents a type of musical scale
that consists of 20 tones, the “amino acid scale”. To create a playable instrument, each tone associated with the amino
acids is assigned to a specific key on a piano roll, which allows us to map the sequence of amino acids in proteins into a
musical score. To reflect higher-order structural details of proteins, the volume and duration of the notes associated with
each amino acid are defined by the secondary structure of proteins, computed using DSSP and thereby introducing
musical rhythm. We then train a recurrent neural network based on a large set of musical scores generated by this
sonification method and use AI to generate musical compositions, capturing the innate relationships between amino acid
sequence and protein structure. We then translate the de novo musical data generated by AI into protein sequences,
thereby obtaining de novo protein designs that feature specific design characteristics. We illustrate the approach in several
examples that reflect the sonification of protein sequences, including multihour audible representations of natural
proteins and protein-based musical compositions solely generated by AI. The approach proposed here may provide an
avenue for understanding sequence patterns, variations, and mutations and offers an outreach mechanism to explain the
significance of protein sequences. The method may also offer insight into protein folding and understanding the context
of the amino acid sequence in defining the secondary and higher-order folded structure of proteins and could hence be
used to detect the effects of mutations through sound.
KEYWORDS: protein, structural analysis, sonification, artificial intelligence, recurrent neural networks, molecular mechanics

Materials and music have been intimately connected throughout centuries of human evolution and civilization.1−4 Indeed, materials such as wood, animal skin, or metals are the basis for most musical instruments used throughout history.5,6 Today, we are able to use advanced computing algorithms to blur the boundary between material and sound and use hierarchical representations of materials in distinct spaces such as sound or language to advance design objectives.2,7−9 The approach proposed here is that the translation of protein material representations into
music not only allows us to create musical instruments but also enables us to exploit deep neural network models to represent and manipulate protein designs in the audio space. Thereby we take advantage of longer-range structure that is important in music and which is equivalently important in protein design (in connecting amino acid sequence to secondary structure and folding).10−15 This paradigm goes beyond proteins but rather enables us to connect nanostructures and music in a reversible way, providing an approach to design nanomaterials, DNA, proteins, or other molecular architectures from the nanoscale upward.

Electronic means to generate sound have been an active field in music theory,16 as evidenced in various computer-based electronic synthesizers. These methods typically aim to create a spectrum of overlapping waves either to mimic the sounds of natural instruments (such as a piano, guitar, or classical string instruments) or to generate sounds that do not naturally exist (such as done in early synthesizers such as the Moog and
Roland synthesizers [SH-1000, SH-101, and so on] and more
recently methods such as granular synthesis and wavetable
synthesis). In our lab’s earlier work we considered sonification
of spider webs17,18 and whole protein structures,19 and we also
presented mathematical modeling approaches using category
theoretic representations to describe hierarchical systems and
their translations between different manifestations (e.g.,
between materials, music, and social networks).2,20−22 Other
scientific inquiry in this general area proposed sonification
methods of protein sequences by mapping them onto Western
classical musical scales23−25 or more broadly representing
various scientific data as sound.26−28
In this study we propose a self-consistent sonification formulation by which the amino acid sequence of proteins, the most abundant molecular building blocks of virtually all living matter, is translated into audible sound based on the elementary chemical and physical properties of amino acids, and we apply it to generate designer materials. We explore a distinct avenue based on the
vibrational normal modes of amino acids, reflecting a broad range of diverse protein materials in nature.29,30

The proposed sound-based generative algorithm is based on the natural vibrational frequencies of amino acids. Generally, the vibrational spectra of molecules can be computed by computational chemistry methods such as density functional theory (DFT)31−33 or molecular dynamics (MD).34−36 We use a computer algorithm to convert these inaudible vibrations into a space that the human ear can detect. By making these natural vibrations of the proteins audible, they can then be used to creatively express sound and generate music that is based on the complex vibrational spectrum offered by these protein structures. This offers an avenue to sonify the characteristic overlays of natural frequencies and to use them as a playable musical instrument.

The significance of considering vibrations as a means to
translate between material and sound has broad-ranging implications. For instance, it was suggested that protein
vibrations play a role in information processing in the brain,37 protein expression,38 or the growth of plants.39,40 The use of AI in understanding and classifying proteins and predicting de novo amino acid sequences has been explored in recent literature and presents an opportunity for further research investigations.41−44 Other work has applied AI to design composites, which can offer an efficient means for materials design and manufacturing.45,46 Here we apply AI to learn the hierarchical structures of protein sequences in musical space and use this representation to generate protein designs through this translational approach: material to music, solving a design problem in musical space, and translation back to material.

The plan of the paper is as follows. We first present an analysis of the translation of the vibrational spectra of each of the 20 amino acids into audio signals, using the concept of transpositional equivalency. We then report various sonifications of known protein structures into musical scores, extracted from the Protein Data Bank (PDB). Based on a large number of sonified protein structures we train a recurrent neural network and generate musical expression using AI. We then map the musical scores generated by AI into amino acid sequences and analyze the resulting protein structures. The overall approach reported in this paper is presented in Figure 1, showing how the mapping and reverse mapping closes the loop between material manifestation and musical space and back.

Figure 1. Overall flowchart of the work reported here, closing the loop between different manifestations of hierarchical systems in material and sound and the reversible translation in between the two representations. Future work could generate musical expressions by human compositions and thereby lead to de novo amino acid sequence designs and de novo proteins. In this paper, we generate musical compositions using AI, offering a design method for proteins. A key insight from this overarching approach is that we can use the neural network to generate music that is innately encoded with patterns reflecting the design principles of a certain group of protein structures. This encoded information in the audio can then be turned into protein sequences that are not included in the training set but that resemble the set of desired features. This means that the neural network has learned the design principles by which certain structural features are generated from the sequence of amino acids, closing the loop between material → sound → material.

RESULTS AND DISCUSSION

A detailed description of methods used in this work is included in the Methods section. Since it is the basis for sound generation, we first review the frequencies generated by each amino acid, as depicted in Figure 2. Figure 2(a) shows the frequencies of the vibrational modes, from lowest to highest. The data show that each amino acid is associated with a particular frequency spectrum.

Figure 2. Analysis of vibrational spectra of amino acids based on DFT data. (a) Depiction of the frequencies associated with each of the 20
amino acids computed based on DFT, with original data taken from ref 31, where the x-axis reflects the number of the mode whose
frequency is plotted. In the sonification approach, the sound of each amino acid is generated by overlaying harmonic waves at the said
frequencies and playing them together, creating a complex sonic spectrum associated with each of them. (b) Depiction of the lowest
frequency of each amino acid, sorted from smallest to largest. The range of notes of the base frequencies, in terms of conventional musical
scales, is approximately from B0 to F4 (spanning around 3 octaves). However, the total sonic character of each amino acid is more complex,
as it is created through the overlay of all frequencies shown in panel (a).

The heaviest amino acid, TRP (tryptophan), shows the slowest increase of frequencies (and also the most modes, since it has the most degrees of freedom). GLY (glycine), the lightest amino acid, shows the fastest increase of frequencies with modes (and also the fewest modes, since it has the fewest degrees of freedom).

Figure 2(b) depicts the lowest frequency of each amino acid, sorted from smallest to largest. We find that the range of notes of the base frequencies, in terms of conventional musical scales, is approximately from F2 to C#5. However, the sonic character of each amino acid is much more complex and does not follow conventional tunings, as it is created through the overlay of all naturally occurring frequencies. The frequencies of each amino acid are included as Supporting Information, with the results for both DFT B3LYP31 and MD CHARMM based computations of the eigenfrequencies (supplementary files: ALL AAs - CHARMM - frequency data.txt and ALL AAs - DFT - frequency data.txt). Note that although we performed the analysis with both DFT B3LYP and MD CHARMM data, only the DFT-based data are used from here on, as they are considered a more accurate representation of the vibrational spectra of amino acids. The MD CHARMM data, however, can be useful for consistency if additional sonifications of larger molecules or entire proteins are considered. Using the MD CHARMM data in such cases would allow for a consistent prediction of tonal character across further hierarchical scales.

We find that the lowest frequency generated across all 20 amino acids stems from the TYR (tyrosine) residue, and in our algorithm, it is represented by a value of 61.74 Hz. The highest produced frequency is around 20 000 Hz. The audible frequency spectrum of humans is within the range of 20 to 20 000 Hz, and hence most generated protein vibrations fall within that range.

Figure 3 shows an analysis of the frequency spectrum of the 20 amino acids, mapped onto a piano keyboard that features the 12 tones per octave used in Western classical music (detected notes are displayed in the form of "blobs" in the analysis). The analysis shows that the character of each amino acid sound is composed of multiple frequency clusters, representing a concept similar to a musical chord. Further, the data show that while some frequencies fall on piano keys, most are in between keys, representing a complex collection of frequencies. Attempting to fit a natural musical scale to the data, the algorithm predicts the best fit overall to be a C minor scale. Table 1 shows the results of an analysis where the best fit to a musical scale for each of the 20 amino acids is presented. The analysis suggests that the soundings of the amino acids are reflected through a set of major and minor scales, with varying degrees of fit to those scales. A sweep through the sounds associated with each of the 20 amino acids (ALA, ARG, ASN, ASP, CYS, GLU, GLN, GLY, HIS, ILE, LEU, LYS, MET, PHE, PRO, SER, THR, TRP, TYR, VAL) is represented in 20 AA sweep − DFT.mp3 (all audio files referenced in this paper are attached as Supporting Information).

The CHARMM-based sonifications show a similar tonal characteristic, although there are differences in the frequency spectra (generally, the frequencies predicted by CHARMM are higher, which we attribute to the assumption of point charges in CHARMM that results in fewer degrees of freedom, more rigidity, and thus higher frequencies). We note that due to computational limitations, DFT is not a feasible approach to simulate the vibrations of very large molecules or complexes of large molecules. CHARMM, on the other hand, offers a computationally more efficient way to compute molecular vibrations, which can be scaled to millions of atoms and beyond.

Table 1. Analysis of Musical Scale Associated with Each of the 20 Amino Acids, Determined Based on a Best Fit Analysis of the Sound Spectruma

amino acid identifier    musical scale fitted to its frequency spectrum    match to scale score
ALA    Bb major    70%
ARG    F major     57%
ASN    A major     75%
ASP    C major     63%
CYS    D major     78%
GLU    Eb major    60%
GLN    E major     67%
GLY    B minor     43%
HIS    F major     50%
ILE    Eb major    63%
LEU    Eb minor    25%
LYS    Eb minor    50%
MET    B major     50%
PHE    E minor     33%
PRO    Eb minor    71%
SER    Eb minor    57%
THR    G# minor    43%
TRP    C# minor    71%
TYR    E minor     56%
VAL    G minor     57%

aThe match to scale score is calculated based on the ratio of how many notes of all notes included in each amino acid sound fall onto the scale that was determined as the best fit. The data show that the fit to scale ranges from 25% (for LEU) to 78% (CYS).
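To make the match-to-scale score concrete, the short sketch below recomputes such a ratio from a list of detected note names. It is a simplified illustration of the calculation described in the table footnote, assuming the detected pitches are already available; it is not the Melodyne scale-detection algorithm used in this work, and the example note list is hypothetical.

```python
# Simplified illustration of the "match to scale" score of Table 1:
# fraction of detected notes whose pitch class lies on a candidate scale.
# The note list below is hypothetical; the paper uses Melodyne's scale
# detective on the actual amino acid sounds.

NOTE_TO_PC = {"C": 0, "C#": 1, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
              "F#": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "Bb": 10, "B": 11}
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]   # pitch-class offsets of a major scale
MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]   # natural minor

def scale_pitch_classes(root, mode):
    """Return the set of pitch classes belonging to a major or minor scale."""
    steps = MAJOR_STEPS if mode == "major" else MINOR_STEPS
    return {(NOTE_TO_PC[root] + s) % 12 for s in steps}

def match_to_scale(detected_notes, root, mode):
    """Fraction of detected notes that fall on the scale (0..1)."""
    scale = scale_pitch_classes(root, mode)
    on_scale = sum(1 for n in detected_notes if NOTE_TO_PC[n] in scale)
    return on_scale / len(detected_notes)

def best_fit_scale(detected_notes):
    """Brute-force search over all major/minor scales."""
    candidates = [(r, m) for r in NOTE_TO_PC for m in ("major", "minor")]
    return max(candidates, key=lambda rm: match_to_scale(detected_notes, *rm))

# Hypothetical set of notes detected for one amino acid sound
notes = ["Bb", "D", "F", "C", "Eb", "A", "G", "E", "Bb", "F"]
root, mode = best_fit_scale(notes)
print(root, mode, round(match_to_scale(notes, root, mode), 2))
```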

It is noted that TYR has the lowest base frequency, as confirmed in Figure 2. In the melodic analysis shown in Figure 3 (top), where the frequency spectrum is mapped onto a piano keyboard, the TYR residue does not, however, feature the lowest frequency. A melodic range spectrogram analysis of the original audio data depicted in Figure 3 (bottom) confirms that, indeed, TYR features the lowest base frequency in the produced audio. We attribute this to the algorithm used to map a complex sound onto a piano roll, which reflects the individual components of the sound but lacks some detail of the lower frequencies for this sound. More analysis could be done to better understand why the algorithm does not reflect this detail.

Figure 3. Top: Analysis of the frequency spectrum of the 20 amino acids (horizontal axis, labels at bottom) mapped onto a piano roll (left vertical axis) that includes the 12 semitones assigned in Western classical music. The analysis shows that the character of each amino acid sound is composed of multiple frequency clusters, representing a concept similar to a musical chord. Further, the data show that while some frequencies fall on piano keys, many are in between keys, representing a complex collection of frequencies. The left side of the graph depicts a piano roll (with its characteristic combination of white and black keys, 12 of them per octave and repeating), with the labels indicating the note associated with each key. The bottom graph shows a spectral analysis of the audio produced for all 20 amino acids, for a frequency range from 50 to 20 000 Hz.

To make the sounds of amino acids accessible to and playable by a broad audience, we developed a phone app that allows users to play the various soundings in an interactive manner, record and edit played sequences, and share the created music for further processing. A series of screenshots of the app is shown in Figure 4, and the app is available for download in the Google Play store as "Amino Acid Synth" (Google Play is a trademark of Google Inc.). The app features 20 keys representing the notes in the "amino acid scale" and allows users to interactively create melodies that represent amino acid sequences. Functional details of the app are explained in the caption of Figure 4.

We now apply the protein sonification method described in the Methods section to translate protein sequences into musical expressions and process those further by focusing on what musical features resemble certain protein features. Table 2 summarizes a set of protein structures and associated audio files, all created based on existing protein structures. The name of each file corresponds to the PDB ID as listed in https://www.rcsb.org. The proteins translated into musical expressions include 194l (lysozyme), 107m (myoglobin), 6cgz (β-barrel), a silk protein, an amyloid protein, and others. Figure 5 shows an example of the musical score for 194l (lysozyme), featuring a musical piece 21 bars in length. The musical score illustrates the interplay of pitch (reflecting different amino acids) and rhythm (reflecting different secondary structures), altogether reflecting the protein fold in musical space.

A general weakness of sonification approaches alone is that they are not necessarily enough, on their own, to understand protein structure upon listening. At a minimum, such understanding is strongly dependent on a person's experience, training, and musical skill. To overcome this limitation, we propose using an AI approach to capture the expressions of the hierarchical structures of proteins in musical space through a neural network. Once trained against a data set, the neural network is capable of predicting musical expressions that resemble proteins that were not part of the training set. This overarching framework, summarized in Figure 1, allows one to utilize sonification as a design method through the use of AI.

Figure 4. Screenshots of the phone app, which allows users to play the sounds associated with each amino acid interactively, explore the
sonic landscape, and record played sequences and share them (for further processing, e.g., to synthesize or computationally fold sequences
created through playing the instrument). (a) Primary screen. (b) Amino acid keyboard, where each key on the phone app is assigned to the
sound of one amino acid type, which plays upon touch. (c) Built-in sequence editor to change sequences played interactively with the
keyboard. Space can be added to distinguish multiple protein chains. (d) Information panel of the app, giving scientific background and a
reference to this paper. This app is published for free public download (https://play.google.com/store/apps/details?id=com.synth.
aminoacidplayer; source code of the app is attached in the SI).

We train three neural networks reflecting the music generated by distinct protein classes. Details of the methods used are included in the Methods section. Training set #1 includes a set of β sheet (BS) rich proteins, training set #2 is α helix (AH) rich proteins, and training set #3 is a combination of the former two. We then use these trained neural networks to generate musical scores and then translate the musical scores back into amino acid sequences to obtain a set of de novo proteins, whose folded structure we analyze.

Table 3 summarizes the results of these AI-generated musical compositions as well as images of the de novo proteins designed by AI. We note that while the musical representations that are used to train the neural networks include both sequence and secondary structure information, when we translate the musical scores back into amino acid sequences, we solely capture the sequence of amino acids. This serves as a way to test the predictive capabilities of the neural networks as to whether or not they are capable of predicting proteins with the desired secondary structures and higher-order folding patterns. Indeed, the data in Table 3 show that the neural networks achieve this feat, as they are capable of designing proteins with the desired features: AH-rich proteins, BS-rich proteins, or a mix of the two.

Further addressing this issue, we analyze the predicted musical patterns to better understand how secondary structures of proteins are reflected in musical space. Figure 6 shows an analysis of musical patterns, here visualized on a piano roll (y-axis) over time (x-axis) (top) and as a representation of notes as bars (bottom) to show rhythmic detail. The data are shown for three proteins in the training set (α helix rich 107m, β sheet rich 6zg, and a mix of various secondary structures in 194l). Figure 7 shows similar data for one of the de novo predicted proteins, revealing how secondary structure information is also encoded in the predicted proteins, in agreement with the ORION and MODELER predictions.

Figure 8 shows a melodic spectrum for the sonified representation of lysozyme with PDB ID 194l. The figure also depicts a comparison with the secondary structure, revealing how certain acoustic patterns are associated with certain secondary structures of the protein.

An interesting insight from this work is the interplay of universality and diversity. The elementary building blocks of proteins, e.g., amino acids and secondary structure types, are limited. However, the structures that are built from these, using hierarchical principles of organization, are complex and responsible for proteins being capable of acting in many functional roles (e.g., enzyme, structural material, molecular switch). The expression of sequence in a musical space offers a means to understand how different length scales determine function. It can be seen that the AI-generated musical compositions and amino acid sequences show similar repetitive characteristics as seen in the protein training set. The analysis in Table 3 shows that the structure of the resulting proteins reflects those characteristic features learned in the training set, confirming that the model is able to capture key structure−functional relationships between amino acid sequence and various levels of protein organization. As a specific example, it is possible to design proteins with certain structural features (e.g., α helices, β sheets) with this process. A key insight from this is that we can use the neural network to generate music that is innately encoded with patterns reflecting the design principles of a certain group of protein structures. This encoded information can then be turned back into protein sequences that are not included in the training set, but that resemble a set of desired features. This means that the neural network has learned the design principles by which certain structural features of proteins are generated from the sequence of amino acids.

Table 2. List of Audio Files Created Based on Existing Protein Structures (Most of Them Experimentally Determined and Deposited in the Protein Data Bank (PDB)), Sonified Using the Approach Described in the Papera

aThe name of each file corresponds to the PDB ID as listed in https://www.rcsb.org.

Figure 5. Musical score generated for the protein 194l (lysozyme), 21 bars long. Note that the notes indicated do not reflect a conventional musical scale; rather, each note in the space of 20 admissible tones in the native amino acid scale is assigned to one of the 20 amino acids. The score is shown here only for visualization of the concept and to illustrate the timing, rhythm, and progression of notes as learned from the amino acid sequence (in the score C2 = ALA, D2 = ARG, and so on; the 20 amino acids are assigned to 20 notes of the C major scale on a piano roll [the white keys]). The score illustrates the progression from α helix rich secondary structures to segments of β sheet folds, to α helix structures, to random coils toward the end.

Other future applications could be, for instance, to build a database of various enzymes and then use the design approach shown here to develop a set of enzymes that can be used as the basis for functional optimization and exploration of a very broad design space. The expression of certain features can be achieved either by selecting a protein as a seed for further generation or by developing a training set or global conditioning to reflect certain features. Similar concepts as proposed here for protein design may be applied to other nanomaterial design problems and to interactions between proteins and nanoparticles.47

Table 3. Summary of AI Designed de Novo Proteins Using the Three Neural Network Models Developed and Description of Corresponding Audio Files on Which the Protein Structure Is Based

CONCLUSION

In this paper we reported an approach to sonify protein sequences and understand protein compositions in a different, musical space. This translation may offer cognitive avenues to understand protein function and how it changes under variations of sequence, secondary structure, and other parameters. As demonstrated in the paper, the representation of a protein in musical space (a sort of language) also allows us to use neural network methods to train, classify, and generate de novo protein sequences. Our method can provide a useful tool that allows anyone to easily translate between protein and music, make rigorous analogies with the training set, and satisfy given design requests (sequence seed, secondary structure, etc.). We think this approach may be generalized to express the structure of other nanostructures in a different domain (here, sound) that provides a better interface with human cognition than plain data and may inspire more creativity. For example, it will allow humans to tune music, either intuitively or according to music theory and tools, to modify a protein structure. The method offers an avenue for musical compositions to be translated into protein sequences and for understanding patterns in various forms of hierarchical systems and how they can be designed.

Proteins are the most abundant building blocks of all living things, and their motion, structure, and failure in the context of both normal physiological function and disease is a foundational question that transcends academic disciplines. In this paper we focused on developing a model for the vibrational spectrum of the amino acid building blocks of proteins, an elementary structure from which materials in living systems are built. This concept could be broadly important. For instance, at the nanolevel of observation, all structures continuously move. This reflects the fact that they are tiny objects excited by thermal energy and set in motion to undergo large deformations. This concept of omnipresent vibrations at the nanoscale is exploited here to extract audio as one way to represent nature's concept of hierarchy as a paradigm to create complex, diverse function from simple, universal building blocks. More broadly, the translation of various hierarchical systems into one another poses a paradigm to understand the emergence of properties in materials, sound, and related systems and offers design methods for such systems where large-scale and small-scale relationships interplay. Additional analyses could be performed, for instance by investigating mutations and other aspects associated with disease mutations, offering potential avenues for future work.

The method reported here can find useful applications in STEM outreach and general outreach to explain the concept of protein folding, design, and disease etiology (through making protein misfolding or mutations audible) to broad audiences. It also offers insights into the couplings between sound and matter, a topic of broad interest in philosophy and art. Finally, the AI-based approach to design de novo proteins provides a generative method that can complement conventional protein sequence design methods.

METHODS

Vibrational Spectrum. To generate audible sound, we use the vibrational spectrum of amino acids, defined by the set of eigenfrequencies, as a basis. We consider two data sets for sound generation, one that bases the vibrational spectra on B3LYP DFT as published in ref 31 and another where we use CHARMM MD to compute the same data. In the latter case we use a custom Bash script that integrates multiple open source software packages with the CHARMM c37b1 program to automatically analyze each of the amino acids and then compute their normal modes.


Figure 6. Analysis of musical patterns, here visualized on a piano roll (y-axis) over time (x-axis) (top) and a representation of notes as bars
(bottom) to show rhythmic detail. The data are shown for three proteins in the training set (α helix rich 107m, β-sheet rich 6zg, and a mix of
various secondary structures in 194l). The images show how the secondary structures are reflected in musical patterns (examples of specific
areas highlighted in bottom row).

Figure 7. Analysis of musical patterns generated by AI, here visualized on a piano roll (y-axis) over time (x-axis) (top) and as a representation of notes as bars (bottom) to show rhythmic detail, for the longer protein sequence predicted from the AH-BS training set. As shown in the analysis, the model predicts protein designs with both α helix rich segments (toward the beginning of the sequence, predicted protein shown on top left) and β sheet rich segments (toward the end of the sequence, predicted protein shown on bottom right).

Translating a Vibrational Spectrum into Audible Sound. We use an interactive tool that allows us to generate sounds based on the list of eigenfrequencies provided, implemented in Max 8.03,48,49 and accessed through a Digital Audio Workstation (DAW), Ableton Live 10.1b15 (Ableton is a trademark of Ableton AG).50 Max is a visual programming language for music and is used here to implement a method to realize the sound of all amino acids analyzed using our method. We use a sound generation engine developed earlier19 and adapt it here for the synthesis of amino acid soundings, considering the first 64 vibrational modes of each amino acid (higher-order modes beyond the audible spectrum are not considered).

To translate the vibrational frequencies into audible sound, we transpose the frequencies of molecular vibrations into the audible range by multiplying the frequencies, normalized by the lowest frequency found in any of the 20 amino acids (the first mode of TYR), by 61.74 Hz (corresponding to the B0 tone). This translation process is based on the music theoretical concept of transpositional equivalence, a feature of musical set theory.51 The base frequency of 61.74 Hz is chosen based on the audible frequency range, so that the resulting frequency spectra of all 20 amino acids are transposed to the audible range. We build the spectrum of higher-order frequencies on top of the lowest eigenfrequency, each represented by harmonic sine waves that are added to form the audio signal associated with each amino acid. This method of overlaying higher frequencies based on the particular spectrum of each amino acid allows us to translate the frequencies into audible space and to maintain the characteristic sound spectrum associated with each amino acid without altering it by confining it to conventional musical scales.

An advantage of using the chemistry-based approach to define the soundings of each amino acid is that the characteristic sound of each, defined by the set of harmonic waves superpositioned to create the audio, is self-consistent across all amino acids and naturally captures the differences between distinct amino acid vibrational spectra. This leads to a specific tonal characteristic, or timbre, for each of the amino acids. Moreover, since the base frequency and all higher-order contributions of each amino acid residue are different, as shown in Figure 2(b), the sound associated with each amino acid is distinct and has a reversible association with a musical note. These notes do not reflect the classical Western musical scales,51,52 but define their own natural scale innate in the vibrations of the amino acids. Alternative approaches that have been defined as a means to map amino acid sequences to sound assign a certain classical note or chord to each amino acid residue.23 This earlier method maps the protein sequences into the framework of Western classical musical scales. However, it does not capture the foundational vibrational characteristic of each protein, as it was predetermined to be expressed in classical Western scales.23 Our analysis suggests that there exists an "amino acid scale" that is composed of 20 sounds.
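A minimal sketch of the transposition and overlay steps described above is given below, assuming the eigenfrequencies of one amino acid are available as a plain list; the actual implementation is a Max 8 patch driven from Ableton Live, and the frequency values, file name, and global minimum used here are illustrative placeholders only.

```python
# Sketch of the sonification step: transpose an amino acid's eigenfrequencies
# into the audible range and overlay harmonic sine waves to form its sound.
# The frequency list and GLOBAL_MIN below are hypothetical placeholders; the
# real values come from the DFT/CHARMM data in the Supporting Information.
import numpy as np
from scipy.io import wavfile

BASE_HZ = 61.74       # lowest transposed frequency across all 20 amino acids
GLOBAL_MIN = 36.5     # placeholder: lowest eigenfrequency of any amino acid (TYR, mode 1)
SAMPLE_RATE = 44100

def amino_acid_wave(eigenfrequencies, duration=2.0, n_modes=64):
    """Overlay sine waves at the transposed mode frequencies (first 64 modes)."""
    t = np.linspace(0.0, duration, int(SAMPLE_RATE * duration), endpoint=False)
    wave = np.zeros_like(t)
    for f in sorted(eigenfrequencies)[:n_modes]:
        f_audible = (f / GLOBAL_MIN) * BASE_HZ     # transpositional equivalence
        if f_audible < 20000.0:                    # keep within the audible band
            wave += np.sin(2.0 * np.pi * f_audible * t)
    return wave / np.max(np.abs(wave))             # normalize amplitude

# Hypothetical eigenfrequency list (only the ratios matter for the transposition)
freqs = [36.5, 61.1, 88.4, 140.2, 233.9, 402.7, 611.0]
audio = amino_acid_wave(freqs)
wavfile.write("amino_acid_sketch.wav", SAMPLE_RATE, (audio * 32767).astype(np.int16))
```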


Figure 8. Melodic range spectrogram over time for the sonified representation of lysozyme with PDB ID 194l (total duration of the music
analyzed is around 48 s). The figure depicts also a comparison with the secondary structure, revealing how certain acoustic patterns are
associated with certain secondary structures of the protein. (a) Frequency spectrum and (b) secondary structure over the sequence of the
protein (note that the time axis in panel (a) and sequence axis in panel (b) are not identical, since different secondary structures are
associated with different rhythms, leading to variations in time passed per amino acid).

The resulting audio recordings are analyzed using Sennheiser HD 800 S high-resolution reference headphones (Sennheiser electronic GmbH & Co. KG, frequency response: 6−48 000 Hz (−10 dB)), as well as spectrum analyzers implemented in Ableton Live and Max/MSP, to empirically confirm the predicted frequency ranges. The use of reference headphones is important for the detailed analysis of the sounds produced. The wide and flat frequency response of the reference headphones allows us to hear exactly the minute changes of the tones generated across the broad frequency spectrum originating from the amino acid vibrations.

Spectral Analysis of Amino Acid and Protein Soundings. We use Melodyne Studio 4.2.153 to analyze the note spectrum associated with each of the 20 amino acids, mapped onto a piano roll. Using the polyphonic detection mechanism in Melodyne (DNA Direct Note Access technology), we analyze the sound of all 20 amino acids (Melodyne and DNA Direct Note Access are registered trademarks of Celemony Software GmbH). The method allows us to detect the notes of which the sound of each amino acid is composed, as well as underlying musical parameters such as relative volume. We use the scale detective function in Melodyne to find the musical scale that best fits each amino acid sounding.

We use Sonic Visualiser (version 3.2.1)54 to analyze time histories of frequency patterns of the produced sounds, to study the initial music and features representing certain secondary structures, and to compare predicted musical features with the folded proteins. We apply the melodic spectrum analysis tool to represent the frequency spectrum (y-axis) over time (x-axis).

Translating Protein Sequence into Musical Scores. Building a Musical Instrument. After generating the sound of each of the 20 amino acids, we assign each of the amino acids to one key on a piano, using Ableton Live Sampler (a sampling instrument that allows one to play back audio recordings). We use an input device such as a MIDI keyboard, Ableton Push, ROLI BLOCK, or similar devices that allow convenient access to playing the instrument (the advantage of using devices like Push or BLOCK is that they do not follow the traditional 12-tone piano roll setup with white and black keys but can instead be programmed to represent the 20-tone "amino acid scale"). One can visualize the resulting musical instrument as a piano with 20 keys. This setup allows one to play the musical instrument and use the amino acid soundings as a generative way to create musical complexity over time. For instance, the C3 key is mapped to ALA, the D3 key to ARG, and so on (for all 20 amino acids, on 20 distinct
keys). It is important to note that the proteins do not reflect the sound of what is commonly associated with C3, D3, etc. Instead, they directly resemble the sound defined by their innate vibrational spectrum as described above, without any change in the frequency spectrum or tonal characteristic. The mapping onto piano keys is done solely for convenience in the use of existing interactive electronic music devices, the Digital Audio Workstation, and MIDI.

Mapping Amino Acid Sequences into Musical Scores. Another way by which we exploit the sonification approach is to map amino acid sequences into musical scores that reflect music composed in the "amino acid scale". Using the bioinformatics libraries Biopython and Biskit, we developed a python script that translates any sequence into a musical score. Sequences using the one-letter amino acid code can be entered either manually or based on lists of one or more protein PDB identifiers. We also implemented a function by which proteins can be searched and grouped, using PyPDB. This allows one to build complex musical scores (e.g., to generate music or for use as training sets for the neural networks). Musical scores are stored as MIDI files that can be accessed by DAWs and sonified using the Ableton Sampler tool described above.

To reflect higher-order chemical structure in the musical space, we incorporate information about the secondary structure associated with each amino acid in the translation step, affecting the duration and volume of the notes played. We use DSSP to compute the secondary structure from the protein geometry file and sequence.55,56 Table 4 lists the parameters determined by this approach. We propose using longer note durations for disordered secondary structures, very short note durations for helices, and short notes for β sheets. We also modulate the volume by rendering β sheets the loudest and the others more softly. For instance, ALA residues in a BS will be played louder and slower than ALA residues in an AH, which will be played in a fast and repetitive manner. Similarly, ALA residues in random coils or unstructured regions will be played slowly and softly. These modulations of the tone by volume and timing lead to a certain rhythmic character that overall reflects the 3D folded geometry of the protein. It is noted that, distinct from the way we obtained the frequency spectra of amino acids, the effect of secondary structure on the musical score is not directly based on physical principles and involves choices. However, for the training of the neural networks, capturing these features is essential, as they reflect the hierarchical nature of the protein fold from primary, to secondary, to tertiary and higher-order structures.
Table 4. Incorporation of Secondary Protein Structure in the Translation into a Musical Score, Affecting Note Timing and Note Volumea

secondary structure             note timing    note volume
β sheet (all types)             1.0            1
helices (α helix and others)    0.5            0.5
random coil and unstructured    2.0            0.25

aDifferent proteins are separated by a longer break. By classifying three major secondary structure classes we can capture their representation in musical space and also translate the feature into the AI. Figure 6 shows how these rules are reflected in the corresponding musical score.
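A minimal sketch of this forward translation, using the note timing and volume values of Table 4, is shown below. The white-key note assignment starting at C3, the use of the mido library, and the toy sequence are illustrative assumptions made for this sketch; the actual pipeline is built on Biopython/Biskit, with DSSP providing the per-residue secondary structure.

```python
# Sketch: translate an amino acid sequence plus per-residue secondary structure
# into a MIDI score, using the note timing/volume rules of Table 4.
# The white-key note map (C3 = MIDI 48; octave naming conventions vary) and the
# use of mido are assumptions made for illustration only.
import mido

# 20 amino acids mapped to 20 white keys starting at C3
WHITE_KEYS = [n for n in range(48, 84) if n % 12 in (0, 2, 4, 5, 7, 9, 11)][:20]
AA_ORDER = "ARNDCEQGHILKMFPSTWYV"          # ALA, ARG, ..., VAL as listed in the paper
AA_TO_NOTE = dict(zip(AA_ORDER, WHITE_KEYS))

# Table 4: (note timing in beats, relative volume) per secondary structure class
SS_RULES = {"E": (1.0, 1.0),    # beta sheet (all types)
            "H": (0.5, 0.5),    # helices (alpha helix and others)
            "C": (2.0, 0.25)}   # random coil and unstructured

def sequence_to_midi(sequence, sec_struct, path, ticks_per_beat=480):
    mid = mido.MidiFile(ticks_per_beat=ticks_per_beat)
    track = mido.MidiTrack()
    mid.tracks.append(track)
    for aa, ss in zip(sequence, sec_struct):
        beats, volume = SS_RULES.get(ss, SS_RULES["C"])
        note = AA_TO_NOTE[aa]
        velocity = int(127 * volume)
        track.append(mido.Message("note_on", note=note, velocity=velocity, time=0))
        track.append(mido.Message("note_off", note=note, velocity=0,
                                  time=int(beats * ticks_per_beat)))
    mid.save(path)

# Toy example: a short helical segment followed by a sheet segment
sequence_to_midi("AEALKAVNIV", "HHHHHCEEEE", "design_sketch.mid")
```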
Mapping Musical Scores into Protein Sequences and Protein Structure Analysis. To map musical scores back into amino acid sequences, we developed a script that reads a MIDI file and maps the notes associated with the 20 amino acids back onto amino acids, generating sequence outputs in the one-letter code. In the translation of the musical scores back into amino acid sequences we solely capture the sequence of amino acids. This serves as a means to test the predictive power of the neural networks as to whether or not they are capable of predicting proteins with the desired secondary structures. In principle, secondary structure information could be extracted from the musical scores as well.

The sequence data are used for further analysis to examine similarities with known proteins and to build 3D models using protein folding methods. To better understand the similarities of amino acid sequences, we use tools such as BLAST.57 To build 3D models of proteins, homology methods58 or other protein folding approaches are used. In the analysis reported here we use ORION58 to predict an estimated structure of the designed protein sequences, and a 3D structure is obtained using MODELER,59 reflecting the images shown in Table 3 (right column).
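The reverse mapping described at the beginning of this subsection can be sketched analogously, again assuming the illustrative note assignment and the mido library from the forward-mapping sketch above; the recovered sequences would then be passed to BLAST/ORION/MODELER as just described.

```python
# Sketch of the reverse mapping: read a (possibly AI-generated) MIDI file and
# recover the one-letter amino acid sequence from the note-to-residue map.
# The note assignment mirrors the illustrative forward-mapping sketch above.
import mido

WHITE_KEYS = [n for n in range(48, 84) if n % 12 in (0, 2, 4, 5, 7, 9, 11)][:20]
NOTE_TO_AA = dict(zip(WHITE_KEYS, "ARNDCEQGHILKMFPSTWYV"))

def midi_to_sequence(path):
    sequence = []
    for msg in mido.MidiFile(path):
        # count each sounding note once; ignore all other MIDI events
        if msg.type == "note_on" and msg.velocity > 0:
            aa = NOTE_TO_AA.get(msg.note)
            if aa is not None:            # skip notes outside the 20-key scale
                sequence.append(aa)
    return "".join(sequence)

print(midi_to_sequence("design_sketch.mid"))   # -> "AEALKAVNIV" for the toy file
```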
Design Approaches. The translation to music and from music to protein sequence enables a seamless mapping between different manifestations of matter. It enables a design approach for sequences in either molecular space or musical space, or combinations thereof. For instance, musical compositions generated by humans or AIs can be analyzed in the protein space. Proteins can thereby be a source of musical compositions and generate innovative concepts for artistic expression.

Deep Learning to Generate Musical Scores with a Recurrent Neural Network Model. We use the musical scores generated from amino acid sequences to train a deep neural network, using the Magenta framework developed by Google Brain, which is implemented in TensorFlow60 (https://magenta.tensorflow.org/) (Google Brain and TensorFlow are trademarks of Google Inc.). The recurrent neural network (RNN) we used for melody generation is adopted from language modeling and was implemented in the Melody RNN model61 using TensorFlow.60 This RNN cell uses a long short-term memory (LSTM) unit for time-sequence features, alongside an attention model.10 The attention model allows us to access past information in the musical score and hence learn longer-term dependencies in musical note progression. To illustrate the approach, we develop RNN models using several training sets derived from collections of musical scores translated by the approach described above. We train the model using a batch size of 128, two layers of RNN with 128 units each, and an attention length of 40 steps (2.5 bars). Training is done until convergence is achieved, typically around 40 000 steps or less. The training and generation runs are done on a Dell Precision Tower 7810 workstation (Xeon CPU E5-2660 v4 2.0 GHz, 32 GB memory, with a GeForce RTX 2080 Ti GPU).

Training Set #1: β Sheet Rich Proteins. We use a training set consisting of β-barrel protein structures and similar β sheet rich proteins (PDB IDs 6CZG, 2YNK, 6CZJ, 6CZH, 6CZI, 2JMM, 6CZI, 2JMM, 6D0T, 3P1L, 2QOM, 1G7N, 4K3B, 5EE2, 5G38, 5G39, 5NJO, 6FSU, 1DC9, 2F1V, 2F1T, 5LDT, 2MXU, 2NNT, 3OW9, 2LNQ, 5KK3, 2MUS, 2M5M, 2M5K, 2LBU, 3ZPK, 6EKA, 5O65, 2E8D, 2LMP, 2LMO, 4RIL, 2LMN, 2KJ3, 2RNM, 2LMQ, 5OQV, 2M5N, 2KIB, 2BEG, 2M5N, 2KIB, 5O67, 2N0A, 6CU8, 6CU7, 6CU8, 6FLT, 3LOZ, 4OLR; around 20 000 amino acid residues).

Training Set #2: α Helix Rich Proteins. We use a training set consisting of α helix rich proteins (PDB IDs 6A9P, 6F62, 6F63, 6F64, 6GAJ, 6GAK, 5VR2, 5TO5, 5TO7, 5XDJ, 5LBJ, 2NDK, 5WST, 5IIV, 5D3A, 5HHE, 2MG1, 2LBG, 2L5R, 3V4Q, 2D3E, 2HN8, 2FXO, 3TNU, 4YV3, 1GK6, 3SSU, 3SWK, 2XV5, 3UF1, 3PDY, 1X8Y, 3TNU, 4ZRY, 6E9R, 6E9T, 6E9X, 2MG1; around 20 000 amino acid residues).

Training Set #3: α Helix and β Sheet Rich Proteins. This training set includes all protein sequences from training sets #1 and #2 combined.

Generation of Music. We use the trained neural network model to generate various musical scores. As seeds for musical score generation we use either a set of notes (we seed it with two notes reflecting the ALA and VAL notes) or an existing protein structure taken from the PDB (the seeds used for each of the cases are described in Table 3). Using the synthesis method described above we sonify these musical scores, just as we sonified the naturally occurring musical scores. The seed acts as the basis for further note generation and thereby acts as a template for the following sequences, which are variations and evolutions of the notes represented in the seed. This allows one to use variations in the seed to control the type of musical patterns produced and, by extension, the type of protein designed. While we have explored a variety of seeds in this paper (as shown in Table 3), future studies could explore these relationships in greater detail.
We translate the AI-generated musical scores back to amino acid sequences for further analysis, as described above.
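For orientation, the sketch below shows the training-and-generation loop in a minimal form: a next-note LSTM trained on integer note sequences extracted from the sonified scores and sampled autoregressively from an ALA/VAL seed. It is a stand-in written with the Keras API, not the Magenta attention Melody RNN configuration and hyperparameters actually used in this work, and the toy training data are random placeholders.

```python
# Minimal stand-in for the melody-generation step: a next-note LSTM trained on
# integer note sequences (extracted from the sonified MIDI scores) and sampled
# autoregressively from a short seed. This is NOT the Magenta attention RNN used
# in the paper; it only illustrates the idea.
import numpy as np
import tensorflow as tf

VOCAB = 20          # the 20 tones of the "amino acid scale"
WINDOW = 16         # context length for next-note prediction

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB, 32),
        tf.keras.layers.LSTM(128, return_sequences=True),
        tf.keras.layers.LSTM(128),
        tf.keras.layers.Dense(VOCAB, activation="softmax"),
    ])

def make_windows(scores):
    """scores: list of integer note sequences (0..19). Returns (X, y) pairs."""
    X, y = [], []
    for seq in scores:
        for i in range(len(seq) - WINDOW):
            X.append(seq[i:i + WINDOW]); y.append(seq[i + WINDOW])
    return np.array(X), np.array(y)

def generate(model, seed, length=64):
    notes = list(seed)
    for _ in range(length):
        context = np.array([notes[-WINDOW:]])
        probs = model.predict(context, verbose=0)[0]
        probs = probs / probs.sum()                      # guard against rounding
        notes.append(int(np.random.choice(VOCAB, p=probs)))
    return notes

# Toy training data standing in for the sonified protein scores
rng = np.random.default_rng(1)
toy_scores = [list(rng.integers(0, VOCAB, size=200)) for _ in range(8)]
X, y = make_windows(toy_scores)
model = build_model()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, batch_size=128, epochs=1, verbose=0)
new_notes = generate(model, seed=[0, 19] * 8)   # seed built from the ALA (0) and VAL (19) tones
```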
Interactive Phone App Development. We use Android Studio (https://developer.android.com/) to create an app for Android phones (Android is a trademark of Google LLC). In order to play the sounds, we program a Java class and use the MediaPlayer library in Android Studio. We create a MediaPlayer object and use associated attributes to play and stop the audio. The Java methods are invoked from XML layout files, which are also used to format the colors, text, and design. Audio files in the WAV format of all 20 amino acid sounds are used for the app. The app is published for public download on the Google Play store (https://play.google.com/store/apps/details?id=com.synth.aminoacidplayer). The Java source code is attached in the SI.

ASSOCIATED CONTENT

*S Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acsnano.9b02180.
Overview of all files included in SI.zip (PDF)
Data files, MP3 audio files, and Java code (ZIP)

AUTHOR INFORMATION

Corresponding Author
*E-mail: mbuehler@mit.edu. Tel: +1.617.452.2750.
ORCID
Markus J. Buehler: 0000-0002-4173-9659
Author Contributions
M.J.B. designed this research, in collaboration with Z.Q., C.H.Y., and F.M.M. Z.Q. conducted the MD CHARMM analysis of vibrational frequencies. M.J.B. and C.H.Y. conducted the AI training and AI music generation. F.M.M. contributed the analysis of the DFT data. The paper was written by M.J.B. with input from all coauthors.
Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This research was supported by ONR (grant # N00014-16-1-2333) and NIH U01 EB014976. We acknowledge E. L. Buehler for help with python analysis scripts and designing the interactive phone app. We acknowledge the MIT Center for Art, Science & Technology (CAST) program for fruitful discussions. The RNN models, training sets used for the generations, and associated PDB and MIDI files are available from the authors upon request.

REFERENCES

(1) Bucur, V. Handbook of Materials for String Musical Instruments; Springer International Publishing: New York, 2016.
(2) Buehler, M. J. Materials by Design - A Perspective from Atoms to Structures. MRS Bull. 2013, 38, 169−176.
(3) Hansen, U. J. Materials in Musical Instruments. J. Acoust. Soc. Am. 2011, 129, 2517−2518.
(4) Hofstadter, D. R. Gödel, Escher, Bach: An Eternal Golden Braid; Basic Books: New York, 1979.
(5) Osaki, S. Spider Silk Violin Strings with a Unique Packing Structure Generate a Soft and Profound Timbre. Phys. Rev. Lett. 2012, 108, 154301.
(6) Wegst, U. G. K. Bamboo and Wood in Musical Instruments. Annu. Rev. Mater. Res. 2008, 38, 323−349.
(7) Xenakis, I. Formalized Music: Thought and Mathematics in Composition; Indiana University Press: Bloomington, 1971.
(8) Buehler, M. J. Tu(r)ning Weakness to Strength. Nano Today 2010, 5, 379.
(9) Giesa, T.; Spivak, D. I.; Buehler, M. J. Reoccurring Patterns in Hierarchical Protein Materials and Music: The Power of Analogies. Bionanoscience 2011, 1, 153−161.
(10) Bahdanau, D.; Cho, K.; Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate, arXiv:1409.0473, 2014.
(11) Huang, C.-Z. A.; Vaswani, A.; Uszkoreit, J.; Shazeer, N.; Simon, I.; Hawthorne, C.; Dai, A. M.; Hoffman, M. D.; Dinculescu, M.; Eck, D. Music Transformer, arXiv:1809.04281v3, 2018.
(12) Roberts, A.; Engel, J.; Raffel, C.; Hawthorne, C.; Eck, D. A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music, arXiv:1803.05428v4, 2018.
(13) Tamerler, C.; Sarikaya, M. Genetically Designed Peptide-Based Molecular Materials. ACS Nano 2009, 3, 1606−1615.
(14) Peralta, M. D. R.; Karsai, A.; Ngo, A.; Sierra, C.; Fong, K. T.; Hayre, N. R.; Mirzaee, N.; Ravikumar, K. M.; Kluber, A. J.; Chen, X.; Liu, G.; Toney, M. D.; Singh, R. R.; Cox, D. L. Engineering Amyloid Fibrils from β-Solenoid Proteins for Biomaterials Applications. ACS Nano 2015, 9, 449−463.
(15) Mathieu, F.; Liao, S.; Kopatsch, J.; Wang, T.; Mao, C.; Seeman, N. C. Six-Helix Bundles Designed from DNA. Nano Lett. 2005, 5, 661−665.
(16) Russ, M. Sound Synthesis and Sampling; Focal Press: Burlington, 2009.
(17) Su, I.; Qin, Z.; Saraceno, T.; Krell, A.; Mühlethaler, R.; Bisshop, A.; Buehler, M. J. Imaging and Analysis of a Three-Dimensional Spider Web Architecture. J. R. Soc., Interface 2018, 15, 20180193.
(18) Su, I.; Qin, Z.; Bisshop, A.; Muehlethaler, R.; Ziporyn, E.; Buehler, M. J. Sonification of a 3D Spider Web and Reconstitution into Musical Composition Using Granular Synthesis. Submitted.
(19) Qin, Z.; Buehler, M. J. Analysis of Molecular Vibrations of over 100,000 Protein Structures, Sonification, and Application as a New Musical Instrument. Extrem. Mech. Lett. 2019, 29, 100460.
(20) Yeo, J.; Jung, G.; Tarakanova, A.; Martín-Martínez, F. J.; Qin, Z.; Cheng, Y.; Zhang, Y.-W.; Buehler, M. J. Multiscale Modeling of Keratin, Collagen, Elastin and Related Human Diseases: Perspectives from Atomistic to Coarse-Grained Molecular Dynamics Simulations. Extrem. Mech. Lett. 2018, 20, 112−124.
(21) Spivak, D. I.; Giesa, T.; Wood, E.; Buehler, M. J. Category Theoretic Analysis of Hierarchical Protein Materials and Social Networks. PLoS One 2011, 6, 0023911.
(22) Brommer, D. B.; Giesa, T.; Spivak, D. I.; Buehler, M. J. Categorical Prototyping: Incorporating Molecular Mechanisms into 3D Printing. Nanotechnology 2016, 27, 024002.
(23) Takahashi, R.; Miller, J. H. Conversion of Amino-Acid Sequence in Proteins to Classical Music: Search for Auditory Patterns. Genome Biol. 2007, 8, 405.
(24) Duncan, A. Combinatorial Music Theory. J. Audio Eng. Soc. 1991, 39, 427−448.
(25) All The Scales (https://allthescales.org/, May 14, 2019).
(26) Supper, A. Sublime Frequencies: The Construction of Sublime Listening Experiences in the Sonification of Scientific Data. Soc. Stud. Sci. 2014, 44, 34−58.
(27) Dubus, G.; Bresin, R. A Systematic Review of Mapping Strategies for the Sonification of Physical Quantities. PLoS One 2013, 8, No. e82491.
(28) Delatour, T. Molecular Music: The Acoustic Conversion of Molecular Vibrational Spectra. Comput. Music J. 2000, 24, 48−68.
(29) Beese, A. M.; Sarkar, S.; Nair, A.; Naraghi, M.; An, Z.; Moravsky, A.; Loutfy, R. O.; Buehler, M. J.; Nguyen, S. T.; Espinosa, H. D. Bio-Inspired Carbon Nanotube-Polymer Composite Yarns with Hydrogen Bond-Mediated Lateral Interactions. ACS Nano 2013, 7, 3434−3446.
(30) Giesa, T.; Schuetz, R.; Fratzl, P.; Buehler, M. J.; Masic, A. Unraveling the Molecular Requirements for Macroscopic Silk Supercontraction. ACS Nano 2017, 11, 9750−9758.
(31) Moon, J. H.; Oh, J. Y.; Kim, M. S. A Systematic and Efficient Method to Estimate the Vibrational Frequencies of Linear Peptide and Protein Ions with Any Amino Acid Sequence for the Calculation of Rice-Ramsperger-Kassel-Marcus Rate Constant. J. Am. Soc. Mass Spectrom. 2006, 17, 1749−1757.


(32) Wong, M. W. Vibrational Frequency Prediction Using Density Functional Theory. Chem. Phys. Lett. 1996, 256, 391−399.
(33) Watson, T. M.; Hirst, J. D. Density Functional Theory Vibrational Frequencies of Amides and Amide Dimers. J. Phys. Chem. A 2002, 106, 7858−7867.
(34) Barth, A.; Zscherp, C. What Vibrations Tell Us About Proteins. Q. Rev. Biophys. 2002, 35, 369−430.
(35) Rischel, C.; Spiedel, D.; Ridge, J. P.; Jones, M. R.; Breton, J.; Lambry, J.-C.; Martin, J.-L.; Vos, M. H. Low Frequency Vibrational Modes in Proteins: Changes Induced by Point-Mutations in the Protein-Cofactor Matrix of Bacterial Reaction Centers. Proc. Natl. Acad. Sci. U. S. A. 1998, 95, 12306−12311.
(36) Patodia, S.; Bagaria, A.; Chopra, D. Molecular Dynamics Simulation of Proteins: A Brief Overview. J. Phys. Chem. Biophys. 2014, 4, DOI: 10.4172/2161-0398.1000166.
(37) Smythies, J. On the Possible Role of Protein Vibrations in Information Processing in the Brain: Three Russian Dolls. Front. Mol. Neurosci. 2015, 8, DOI: 10.3389/fnmol.2015.00038.
(38) Ghosh, R.; Mishra, R. C.; Choi, B.; Kwon, Y. S.; Bae, D. W.; Park, S.-C.; Jeong, M.-J.; Bae, H. Exposure to Sound Vibrations Lead to Transcriptomic, Proteomic and Hormonal Changes in Arabidopsis. Sci. Rep. 2016, 6, 33370.
(39) Hassanien, R. H. E.; Tian-Zhen, H.; Li, Y.-F.; Li, B.-M. Advances in Effects of Sound Waves on Plants. J. Integr. Agric. 2014, 13, 335−348.
(40) Fernandez-Jaramillo, A. A.; Duarte-Galvan, C.; Garcia-Mier, L.; Jimenez-Garcia, S. N.; Contreras-Medina, L. M. Effects of Acoustic Waves on Plants: An Agricultural, Ecological, Molecular and Biochemical Perspective. Sci. Hortic. 2018, 235, 340−348.
(41) Al-Shahib, A.; Breitling, R.; Gilbert, D. R. Predicting Protein Function by Machine Learning on Amino Acid Sequences − A Critical Evaluation. BMC Genomics 2007, 8, 12051.
(42) Mirabello, C.; Wallner, B. rawMSA: Proper Deep Learning Makes Protein Sequence Profiles and Feature Extraction Obsolete, bioRxiv 394437, 2018.
(43) Hou, J.; Adhikari, B.; Cheng, J. DeepSF: Deep Convolutional Neural Network for Mapping Protein Sequences to Folds. Bioinformatics 2018, 34, 1295−1303.
(44) Wang, J.; Cao, H.; Zhang, J. Z. H.; Qi, Y. Computational Protein Design with Deep Learning Neural Networks. Sci. Rep. 2018, 8, 6349.
(45) Gu, G. X.; Chen, C.-T.; Buehler, M. J. De Novo Composite Design Based on Machine Learning Algorithm. Extrem. Mech. Lett. 2018, 18, 19−28.
(46) Gu, G. X.; Chen, C.-T.; Richmond, D. J.; Buehler, M. J. Bioinspired Hierarchical Composite Design Using Machine Learning: Simulation, Additive Manufacturing, and Experiment. Mater. Horiz. 2018, 5, 939−945.
(47) Calvaresi, M.; Zerbetto, F. Baiting Proteins with C60. ACS Nano 2010, 4, 2283−2299.
(48) Elsea, P. The Art and Technique of Electroacoustic Music; A-R Editions: Middleton, 2013.
(49) Cycling '74 Max 8, https://cycling74.com/ (May 14, 2019).
(50) Ableton Live Digital Audio Workstation, https://www.ableton.com/en/live/ (May 14, 2019).
(51) Schuijer, M. Analyzing Atonal Music: Pitch-Class Set Theory and Its Contexts; University of Rochester Press: Rochester, 2008.
(52) Forte, A. The Structure of Atonal Music; Yale University Press: New Haven, 1973.
(53) Melodyne Studio, https://www.celemony.com/en/melodyne (May 14, 2019).
(54) Cannam, C.; Landone, C.; Sandler, M. Sonic Visualiser. In Proceedings of the International Conference on Multimedia - MM '10; ACM Press: New York, 2010; pp 1467−1468.
(55) Kabsch, W.; Sander, C. Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 1983, 22, 2577−2637.
(56) Joosten, R. P.; te Beek, T. A. H.; Krieger, E.; Hekkelman, M. L.; Hooft, R. W. W.; Schneider, R.; Sander, C.; Vriend, G. A Series of PDB Related Databases for Everyday Needs. Nucleic Acids Res. 2011, 39, 411−419.
(57) Altschul, S. F.; Gish, W.; Miller, W.; Myers, E. W.; Lipman, D. J. Basic Local Alignment Search Tool. J. Mol. Biol. 1990, 215, 403−410.
(58) Ghouzam, Y.; Postic, G.; Guerin, P.-E.; de Brevern, A. G.; Gelly, J.-C. ORION: A Web Server for Protein Fold Recognition and Structure Prediction Using Evolutionary Hybrid Profiles. Sci. Rep. 2016, 6, 28268.
(59) Eswar, N.; Webb, B.; Marti-Renom, M. A.; Madhusudhan, M. S.; Eramian, D.; Shen, M.; Pieper, U.; Sali, A. Comparative Protein Structure Modeling Using Modeller. Curr. Protoc. Bioinform. 2006, 15, 5.6.1−5.6.30.
(60) Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; Kudlur, M.; Levenberg, J.; Monga, R.; Moore, S.; Murray, D. G.; Steiner, B.; Tucker, P.; Vasudevan, V.; Warden, P.; Wicke, M.; Yu, Y.; Zheng, X. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16); USENIX Association: Berkeley, 2016; pp 265−283.
(61) Waite, E.; Eck, D.; Roberts, A.; Abolafia, D. Project Magenta: Generating Long-Term Structure in Songs and Stories, https://magenta.tensorflow.org/2016/07/15/lookback-rnn-attention-rnn, 2016 (May 14, 2019).
