Mutations in Tumor Suppressor Genes for Various Cancers

Tumor Suppressor Gene Mutations: A Comparative Study
At the beginning of our research project, we found an article, [|Osteosarcoma and Retinoblastoma: A Shared Chromosomal Mechanism Revealing Recessive Predisposition], that we thought was very interesting. The article mentioned that people with hereditary retinoblastoma are more likely to develop osteosarcoma later in life, even if they have been successfully treated for retinoblastoma; the osteosarcoma is not a metastasis from the eye but a primary tumor. This fascinated us, and we wanted to find the link between osteosarcoma and retinoblastoma. We asked, "What links the two seemingly unrelated cancers? What relationship does hereditary retinoblastoma have with osteosarcoma such that osteosarcoma is often found in patients with retinoblastoma?" After looking through the scientific literature and talking to Professor Islas, we had our answer: the tumor suppressor gene Retinoblastoma 1 (Rb1) is responsible for the connection between the two cancers.

Finding an answer to our initial question led us to another one: are there any other tumor suppressor genes that might explain the link between retinoblastoma and osteosarcoma? We thought this would be an interesting and novel topic for our project, so we set up our research to examine two other well-known tumor suppressor genes, TP53 and CDKN2a, in addition to Rb1, to see whether they played a role in the connection between retinoblastoma and osteosarcoma. As our method, we decided to infer connections by analyzing the mutations that occurred in the tumor suppressor genes. In other words, we hypothesized that retinoblastoma and osteosarcoma are related because certain specific mutations in the tumor suppressor genes can cause both cancers. The research therefore involved analyzing mutations in Rb1, TP53, and CDKN2a in retinoblastoma and osteosarcoma tumors, as well as in lung cancer tumors, which served as the control for comparison. Lung cancer was chosen as the control because its tumors had mutations in all three of the tumor suppressor genes. Unfortunately, when we actually carried out the research, we discovered that it yielded no qualitative results that we could analyze or use.

We therefore revised our research topic once again. Instead of looking at three cancers, we expanded to six to give us a larger sample. The six cancers we chose were central nervous system (CNS) cancer, liver cancer, lung cancer, osteosarcoma, retinoblastoma, and urinary cancer. These cancers have mutations in at least two, if not all three, of the tumor suppressor genes. This time, the purpose of the project was to analyze tumor suppressor genes and their mutations in various cancers to discover whether there are any connections between the cancers and the tumor suppressor genes, rather than looking only at connections between retinoblastoma and osteosarcoma.

Our finalized research question asks: What kinds of mutations are found in the major tumor suppressor genes of Rb1, TP53, and CDKN2a for common cancers including central nervous system cancer, liver cancer, lung cancer, osteosarcoma, retinoblastoma, and urinary cancer?

 **Introduction**
Tumor suppressor genes, such as Rb1, TP53, and CDKN2a, code for proteins that maintain normal cell growth and proliferation by placing tight controls on the cell cycle. Mutation or disruption of a tumor suppressor gene, however, can inactivate its protein and result in uncontrolled cell growth and, ultimately, cancer. Cancers associated with mutations in tumor suppressor genes include, but are not limited to, central nervous system cancer, liver cancer, lung cancer, osteosarcoma, retinoblastoma, and urinary cancer.

In our research, we perform a comparative study of the types of genetic mutations in the three tumor suppressor genes for the six cancers. We aim to answer the research question stated above and to identify any correlation between the mutations and the cancers.

The [|Catalogue of Somatic Mutations in Cancer] is a large database that contains collections of tumor samples analyzed for their genetic (somatic) mutations; the information from COSMIC includes mutation spectra and related details of tumor samples. Because our project looks at mutations in Rb1, TP53, and CDKN2a in CNS, liver, lung, osteosarcoma, retinoblastoma, and urinary cancer, COSMIC was not only the

As mentioned above, COSMIC provides information in the form of tables and mutation spectra. To interpret the tables, it is necessary to understand how amino acid positions, amino acid mutations, and nucleotide mutations are written. The following is an example. //AA Mutation// stands for Amino Acid Mutation, where p.C706F means that at amino acid position 706, C (cysteine) was replaced by F (phenylalanine). //CDS Mutation// shows the nucleotide mutation, where c.2117G>T means that at nucleotide position 2117, a G (guanine) was replaced by a T (thymine).



Amino acids are all abbreviated with single letter abbreviations, and these abbreviations will be used throughout our project as well. A table of the single letter abbreviations is provided below.

**Table of Amino Acid Abbreviations**
||= **Single Letter Abbreviation** ||= **Amino Acid** ||
||= A ||= Alanine ||
||= C ||= Cysteine ||
||= D ||= Aspartic Acid ||
||= E ||= Glutamic Acid ||
||= F ||= Phenylalanine ||
||= G ||= Glycine ||
||= H ||= Histidine ||
||= I ||= Isoleucine ||
||= K ||= Lysine ||
||= L ||= Leucine ||
||= M ||= Methionine ||
||= N ||= Asparagine ||
||= P ||= Proline ||
||= Q ||= Glutamine ||
||= R ||= Arginine ||
||= S ||= Serine ||
||= T ||= Threonine ||
||= V ||= Valine ||
||= W ||= Tryptophan ||
||= Y ||= Tyrosine ||
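
The notation described above can also be read programmatically. The following is a minimal sketch, assuming that only simple substitutions of the forms p.C706F and c.2117G>T need to be handled; the function names are our own illustration and are not part of COSMIC.

```python
import re

# Single-letter amino acid codes, copied from the table above.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Assumption: only simple substitutions are handled (no deletions or frameshifts).
AA_SUBSTITUTION = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")
CDS_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def parse_aa_mutation(label):
    """Split an amino acid substitution such as 'p.C706F' into its parts."""
    match = AA_SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple amino acid substitution: {label!r}")
    original, position, replacement = match.groups()
    return {
        "position": int(position),
        # Fall back to the raw letter if it is not in the table.
        "from": AMINO_ACIDS.get(original, original),
        "to": AMINO_ACIDS.get(replacement, replacement),
    }


def parse_cds_mutation(label):
    """Split a nucleotide substitution such as 'c.2117G>T' into its parts."""
    match = CDS_SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple nucleotide substitution: {label!r}")
    position, original, replacement = match.groups()
    return {"position": int(position), "from": original, "to": replacement}


print(parse_aa_mutation("p.C706F"))
print(parse_cds_mutation("c.2117G>T"))
```

Here "from" is the original residue or base and "to" is what replaced it, matching the reading of the two examples given earlier.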

Interpreting mutation spectra is more self-explanatory. The following is an example of a mutation spectrum scaled to the gene's amino acid length. The histogram represents substitution mutations, where bar lengths reflect frequency of occurrence. The blue triangles represent deletions, while the red triangles represent insertions. The rainbow rectangles represent frameshift deletions or insertions. The positions of the bars, triangles, and rainbow rectangles show where the mutations are located on the gene. Clicking on a mutation links to the detailed mutation descriptions shown earlier.



The COSMIC [|Mutation Page] provides more explanations of interpreting data and of navigating the website.

Data retrieved from COSMIC were analyzed in two ways. In the first way, we looked through all the mutations for the three genes and six cancers and recorded the highest frequency (%) mutations in each gene for each cancer. Highest frequency (%) mutations are defined as the top-five highest occurring mutations in one tumor suppressor gene for one type of cancer. In some cases, five were not attainable as a result of insufficient tumor data provided by COSMIC; in other cases, more than five were considered as highest frequency mutations as a result of ties
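
The top-five rule described above can be sketched in a few lines. This is only an illustration under our own assumptions: we take "ties" to mean mutations whose frequency equals that of the fifth-ranked mutation, and the mutation labels and percentages in the example are made up rather than taken from COSMIC.

```python
def highest_frequency_mutations(frequencies, top_n=5):
    """Return (mutation, frequency) pairs for the top_n most frequent mutations.

    Fewer than top_n pairs are returned when fewer mutations exist, and more
    than top_n are returned when several mutations tie at the cutoff.
    """
    ranked = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= top_n:
        return ranked
    cutoff = ranked[top_n - 1][1]  # frequency of the mutation ranked top_n
    return [(mutation, freq) for mutation, freq in ranked if freq >= cutoff]


# Hypothetical labels and percentages, for illustration only.
example = {
    "mut_a": 9.0, "mut_b": 7.5, "mut_c": 6.0, "mut_d": 4.0,
    "mut_e": 2.5, "mut_f": 2.5, "mut_g": 1.0,
}

for mutation, freq in highest_frequency_mutations(example):
    print(mutation, freq)
```

In this example six mutations are kept, because the fifth and sixth share the same frequency.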

**Finding Common Mutations**
In the second way, we looked through all the mutation spectra for the three genes and six cancers for mutations in the tumor suppressor genes that are common to all the analyzed cancers. Where tumor samples for a cancer (retinoblastoma and liver) were not available in COSMIC, we excluded that cancer from our analysis and considered only the available samples.
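
Finding mutations common to all cancers amounts to intersecting the sets of mutations observed in each cancer. The sketch below is an assumption about how this could be automated, not a record of how we did it; cancers without data are marked with None and skipped, and the labels beginning with "hypothetical" are invented for the example.

```python
def common_mutations(mutations_by_cancer):
    """Return the mutations present in every cancer that has data.

    mutations_by_cancer maps a cancer name to a set of mutation labels,
    or to None when no tumor samples were available for that cancer.
    """
    available = [set(m) for m in mutations_by_cancer.values() if m is not None]
    if not available:
        return set()
    return set.intersection(*available)


# Illustrative input: the two deletion labels are the ones reported in our
# Results; the "hypothetical_*" labels are invented for this example.
example = {
    "CNS": {"c.1_471del471", "c.151_457del307", "hypothetical_1"},
    "liver": {"c.1_471del471", "c.151_457del307"},
    "lung": {"c.1_471del471", "c.151_457del307", "hypothetical_2"},
    "osteosarcoma": {"c.1_471del471", "c.151_457del307"},
    "urinary": {"c.1_471del471", "c.151_457del307", "hypothetical_1"},
    "retinoblastoma": None,  # no samples available, so it is excluded
}

print(sorted(common_mutations(example)))
```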

**Results**
Analyses of the genetic mutations in Rb1, TP53, and CDKN2a in the CNS cancer, liver cancer, lung cancer, osteosarcoma, retinoblastoma, and urinary cancer tumor samples provided by COSMIC yielded the following results, which we present in two parts. **Part I** provides a summary of the amino acid locations of the highest-frequency mutations in the three tumor suppressor genes for the six cancers. Clicking on the amino acid sequences will link to more detailed descriptions of the high-frequency mutations as well as complete mutation spectra. **Part II** displays the specific mutations in each tumor suppressor gene that are common to all six cancers.

**Part I. Highest Frequency Mutations**
As mentioned above, this part summarizes the amino acid locations of the highest-frequency mutations in the three tumor suppressor genes for the six cancers. Where a gene does not show data for all six cancers, it is because COSMIC had no tumor sample data for us to collect. Clicking on the amino acid sequences will link to more detailed descriptions of the high-frequency mutations as well as complete mutation spectra.

**Highest Frequency Mutations in Rb1**


**Highest Frequency Mutations in TP53**

**Highest Frequency Mutations in CDKN2a**

**Part II. Mutations Common to All Cancers**
Comparing the highest-frequency mutation figures as well as the complete mutation spectra from Part I, we can see that there is no single mutation in Rb1 that is common to all cancers, nor is there a mutation in TP53 that is common to all cancers.

In CDKN2a, however, there are two mutations, both of which are deletions, that are shared by every cancer listed in the tables below. The first is a CDKN2a gene deletion from nucleotide position 1 to 471, in which 471 nucleotide base pairs are deleted (c.1_471del471). The second is a partial gene deletion from nucleotide position 151 to 457, in which 307 nucleotide base pairs are deleted (c.151_457del307). The tables below give the details of these two common mutations.

**Common Mutations in CDKN2a**
**Deletion Mutation:** c.1_471del471
||= **Cancer Type** ||= **Number of Samples** ||= **Frequency** ||
||= Central Nervous System Cancer ||= 112 ||= 43.6% ||
||= Liver Cancer ||= 3 ||= 3.61% ||
||= Lung Cancer ||= 103 ||= 34.9% ||
||= Osteosarcoma ||= 25 ||= 80.6% ||
||= Urinary Cancer ||= 25 ||= 27.5% ||

**Deletion Mutation:** c.151_457del307
||= **Cancer Type** ||= **Number of Samples** ||= **Frequency** ||
||= Central Nervous System Cancer ||= 67 ||= 26.1% ||
||= Liver Cancer ||= 44 ||= 53.0% ||
||= Lung Cancer ||= 7 ||= 2.73% ||
||= Osteosarcoma ||= 5 ||= 16.1% ||
||= Urinary Cancer ||= 43 ||= 47.3% ||

**Discussion**
Looking at the raw data from our research, our initial approach was to look for patterns by analyzing and comparing mutations based on their specific amino acid locations. We focused on looking for common mutations, that is, identical mutations in a tumor suppressor gene that appeared in all the cancers. For example, when looking at Rb1, we expected to find a substitution mutation at amino acid location 157 (p.157) in all tumor samples. Likewise, when looking at TP53, we expected to find a substitution mutation at amino acid location 249 (p.249) in all tumor samples. Findings like these would be considered significant. Unfortunately, none of our expectations were met except in CDKN2a, where whole and partial gene deletions were seen in all the cancers. Each cancer type seemed to have its own unique pattern of mutation in terms of mutation type and location, so we found it very difficult to generalize about common mutations. Our approach left us unable to tie the cancers and their mutations together in a comparative manner. We were limited to reproducing results taken from COSMIC and were unable to draw further implications.

As a result, we decided to change our approach to analyzing the data. Instead of looking at specific mutations based on their amino acid locations //relative to the genes//, we would evaluate the mutations based on their locations //relative to the domains// present in the genes. This, we thought, would be a much more logical approach that would produce equally significant results. We reasoned that the chance of mutations from different tumor samples falling in the same domain of a gene is much greater than the chance of them falling on exactly the same amino acid location. Hence, our discussion in the following sections focuses on mutations relative to their locations on the domains of the tumor suppressor genes.

We start by explaining the different domains found on Rb1, TP53, and CDKN2a. Next, we present three approaches to analyzing our results. The first approach uses the highest-frequency mutations in the three tumor suppressor genes for the six cancers and shows where these mutations fall on the domains. The second approach builds on the first and looks for common mutations in terms of whether the mutations of the cancers fall on the same domains. The third approach incorporates the results of our research but takes a step back to include mutations from all cancer types and looks at their locations relative to the domains. Together, the three approaches allow us to draw conclusions about which domains in the tumor suppressor genes are most susceptible to mutations, both for the six specific cancers and for all the cancer types that affect humans.
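
The domain-based view described above can be sketched as a simple lookup from amino acid position to domain. The domain boundaries in this example are rough placeholders chosen only for illustration and are not taken from the domain figures in this project; likewise, the list of positions is made up apart from p.249, which is mentioned above.

```python
def domain_for_position(position, domains):
    """Return the name of the domain containing an amino acid position.

    domains maps a domain name to an inclusive (start, end) range.
    Returns None when the position lies outside every listed domain.
    """
    for name, (start, end) in domains.items():
        if start <= position <= end:
            return name
    return None


def count_by_domain(positions, domains):
    """Count how many mutation positions fall in each domain."""
    counts = {}
    for position in positions:
        name = domain_for_position(position, domains) or "outside listed domains"
        counts[name] = counts.get(name, 0) + 1
    return counts


# Placeholder boundaries (assumption): substitute the real ranges from a
# domain annotation before drawing any conclusions.
PLACEHOLDER_TP53_DOMAINS = {
    "DNA binding domain": (100, 290),
    "tetramerization motif": (320, 360),
}

# Illustrative positions only; 249 is the TP53 location discussed above.
positions = [249, 175, 340, 50]
print(count_by_domain(positions, PLACEHOLDER_TP53_DOMAINS))
```

Grouping by domain in this way turns many distinct positions into a small number of categories, which is why shared patterns are easier to see than with exact positions.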

**Rb1**
In the Rb1 protein, three main domains have been identified and studied. The first is the Rb_A superfamily, Rb_A for short. Rb_A is also referred to as the N-terminus of Rb1, and although not much information is available to describe its form and function, it is believed to promote the active pRb conformation. The second major domain is the Rb_B superfamily, Rb_B for short. It is often referred to as the A/B pocket, in which it forms the repressor motif. The A portion of the A/B pocket is a supportive structure for the B portion of the pocket. Together, the A/B pocket works to promote proper folding of pRb. The third domain is the Rb_C superfamily, Rb_C for short. It constitutes the C-terminus of Rb1 and makes up the cyclin binding motif, where phosphorylation and dephosphorylation of pRb are controlled. The domain labeled DUF3452 in blue represents a domain of unknown function.

While each of the three domains has its own specific function, they interact with each other substantially to sustain a fully functional Rb1 protein. Below is a figure illustrating the domains of Rb1 and their relative amino acid locations on the gene. Please refer to the //What is the Rb1 gene?// page for a brief review of Rb1 function.



**TP53**
The TP53 gene has a total of seven domains; two of them are major and will be discussed in our project. The first is the DNA binding domain, which allows TP53 to bind directly to DNA sequences and activate expression of downstream genes that regulate cell proliferation and apoptosis. The second is the tetramerization motif which, as its name implies, helps TP53 monomers assemble into tetramers. Recall that TP53 can only function as a transcription factor in the form of a tetramer made of identical TP53 proteins. Below is a figure illustrating the two major domains of TP53 and their relative amino acid locations. Please refer to //What is the TP53 gene?// for a brief review of TP53 function.



**CDKN2a**
The CDKN2a gene has one large domain made up of ankyrin (ANK) repeats. The number of ANK repeats can range from 2 to over 20 ankyrins, depending on the cell type. ANK mediates protein-protein interactions. Below is a figure illustrating the ANK domain and its common amino acid location. Please refer to //What is the CDKN2a gene?// for a brief review of CDKN2a function.

**First Approach**
When looking at the mutations for each cancer in each gene, we discovered some interesting results when we attributed mutations to the domains they fell on. To look at domain mutations, we used data from the highest-frequency mutations in the //Results// section.

**Rb1**
In the Rb1 protein, the highest-frequency mutations for CNS cancer occurred on the Rb_A and Rb_B domains; for lung cancer on Rb_B; for osteosarcoma on the Rb_A and Rb_C domains; for retinoblastoma on Rb_A; and for urinary cancer on Rb_B. The image below compares the locations of mutations in Rb1 in the various cancers to the domains present in Rb1.

As you can see, mutations in Rb1 do not occur in just one domain but in all three. All three domains in Rb1 help promote the conformation of the protein. If a mutation occurs in one of the domains, the protein's structure will be altered, leading to a different conformation of the Rb1 protein, pRb [2]. A different conformation of pRb means that the function of pRb is also changed. When mutations in the Rb1 domains cause pRb to be stuck in its phosphorylated state, pRb no longer acts as a tumor suppressor protein. This change in protein function could mean that the cell loses a vital checkpoint, leading to endless cell growth and proliferation.

**TP53**
In the TP53 protein, the highest-frequency mutations for CNS cancer, liver cancer, lung cancer, osteosarcoma, and urinary cancer occur in the DNA binding domain. The image below compares the locations of mutations in TP53 in the various cancers to the domains present in TP53.

The DNA binding domain of TP53 is the site where TP53 binds directly to DNA sequences and activates expression of downstream genes to inhibit growth or start apoptosis [3]. Mutations in this domain could prevent those genes from being activated, which means that cell growth is no longer inhibited and the cell can over-proliferate. Mutations in the DNA binding domain of TP53 can thus lead to evasion of programmed cell death.

**CDKN2a**
In the CDKN2a protein, aside from complete and partial gene deletions, the highest-frequency mutations for CNS cancer, lung cancer, urinary cancer, liver cancer, and osteosarcoma occur in the ANK domain. The image below compares the locations of mutations in CDKN2a in the various cancers to the domains present in CDKN2a. The function of the ANK repeats is to mediate protein-protein interactions. CDKN2a is a CDK inhibitor, specifically of CDK4 and CDK6 of the cell cycle [4]. It acts as a negative regulator of the proliferation of normal cells. If the ANK domain is mutated, it can no longer mediate protein-protein interactions. This means that CDKN2a can no longer interact with CDK and inhibit its function, so CDKN2a can no longer regulate the proliferation of the cell. The cell is then free to over-proliferate and form a mass.

===**Second Approach**===

Trying to find common mutations in Rb1, TP53, and CDKN2a across central nervous system, liver, lung, osteosarcoma, retinoblastoma, and urinary cancer tumors yielded interesting results as well. At first glance, it seemed as if there was no correlation between the cancers and their mutations, because the data showed so few mutations common to all the cancers. As mentioned above in the //Results// section, only the deletions in CDKN2a were found throughout the cancers; there were no shared mutations in Rb1 or TP53. A superficial conclusion could therefore have been that CNS, liver, lung, osteosarcoma, retinoblastoma, and urinary cancers all have unrelated, unique genetic causes.

However, when we looked at common mutations in terms of which domains of the genes were mutated, we saw correlations. Referring to the images above in the First Approach section, there is no domain of Rb1 that is mutated in all the cancers. Mutations fall along the entire stretch of the Rb1 gene, and no particular domain holds the most mutations. In TP53, however, it is evident that all the mutations fall on the DNA binding domain. This commonality among the cancers analyzed is very significant, as it implies that the DNA binding domain is crucial to TP53 function. In CDKN2a, aside from the gene deletions common to the cancers, all other mutations fell on the ANK domain. Once again, this commonality among the cancers implies that ANK is crucial to CDKN2a function.

The second approach illustrates a correlation between the locations of mutations in the tumor suppressor genes and the cancers. Looking at mutations in domains proved to be more meaningful than merely looking at amino acid mutations.

===**Third Approach**===

The last approach to assessing the data focuses on the entire mutation spectra of the three tumor suppressor genes for all cancer types, ranging from adrenal gland cancer to cervical cancer, large intestine cancer, skin cancer, and so on. In the following, we look at the mutation spectra of Rb1, TP53, and CDKN2a for all cancers.

**Rb1**


As mentioned before, the Rb1 gene has three domains: Rb_A, Rb_B, and Rb_C [2]. Looking at the spectrum above, mutations are spread throughout the Rb1 gene rather than clustered on one specific domain. This suggests that Rb1 is vulnerable to mutations, as a mutation in any of the three domains can cause structural changes to pRb that ultimately affect its function and its ability to suppress cell growth and proliferation.

**TP53**


Looking at the TP53 mutation spectrum, there appears to be a high frequency of mutation sites in the middle of the gene. Comparing this observation to the figure of TP53 domains below the spectrum, it becomes evident that the mutations fall on the DNA binding domain, while few fall on the tetramerization domain. This suggests that the DNA binding domain is crucial to TP53's function. Once again, the DNA binding domain is the site that allows TP53 to bind directly to DNA sequences and activate expression of downstream genes to regulate cell growth [3]. Mutations in the DNA binding domain affect the activation of genes that trigger cells to halt growth or undergo apoptosis.

**CDKN2a**


The mutation spectrum of CDKN2a has a higher frequency of mutation sites in the middle region of the gene than in other regions. This region is also the location of the ANK domain. Mutations in the ANK domain cause the CDKN2a protein to lose its ability to interact with other proteins. ANK carries out the inhibitory function of CDKN2a: if ANK is mutated, CDKN2a can no longer function as a CDK inhibitor, and regulation of cell growth and proliferation can no longer be achieved.

All three approaches we used to discuss our results ultimately underscore one important concept: mutations in tumor suppressor genes are significant because of the malfunction they cause in protein domains, which then leads to malfunction of the tumor suppressor proteins as a whole. In the following section, we conclude by elaborating on this concept, which we believe most adequately answers our research question.

**Conclusion**
Through this research, we learned that the location of a mutation matters chiefly in terms of which domain of the tumor suppressor gene it falls on. When we tried to find shared mutation locations for CNS cancer, liver cancer, lung cancer, osteosarcoma, retinoblastoma, and urinary cancer on the tumor suppressor genes, we found very little. At first we could not understand why there was not more information on mutation locations. Then we realized that looking at specific mutations at specific locations was too limited and would yield no implications. So we broadened the way we assessed the data. Instead of looking for specific mutations at specific locations, we looked at all the mutations for the six cancers on the domains of the tumor suppressor genes. Doing so produced interesting results that suggest the importance of the domains in the protein. We noticed that mutations in the domains lead to changes in the protein's conformation, which ultimately affect the protein's function. Such an effect typically means the protein can no longer play its role in the cell cycle. These results made us wonder whether this is true only for the six cancers we focused on, so we broadened our search to the other cancers for which COSMIC has mutation data on the tumor suppressor genes. The results were the same: most mutations occur in the domains of the protein. This makes sense because, for a cell to become cancerous, its tumor suppressor genes, which are the backup the cell has, must lose their function. Once that backup is gone, the cell is free to proliferate and become a cancer cell.

What we learned about the function of the tumor suppressor genes and how their disruption leads to cancer helped us understand what cancer is. However, we did not at first understand why we could not find specific mutations at specific locations shared by CNS cancer, liver cancer, lung cancer, osteosarcoma, retinoblastoma, and urinary cancer. The chance of the same mutation occurring at the same location in all six cancers is small, so by looking at whole domains we increased the chance of finding shared mutations. This also helps explain how cancer arises: because the domains are large, a damaging mutation is far more likely to fall somewhere within a domain than at any one specific location.

A future direction for this research is to find out how many dysfunctional tumor suppressor genes it takes to cause cancer. Does one dysfunctional tumor suppressor gene mean that a person has cancer cells, or does it only increase the person's chances of developing them? This would be the next question for us.
