CNV boundaries were estimated from the probe log2 ratio information from tiling array CGH. PCR primers were then designed to amplify the breakpoints of five de novo deletions. PCR was performed on genomic DNA from all members of the trio. PCR products were sequenced by the Sanger method using both forward and reverse primers specific for each de novo deletion. We examined whether genes impacted by de novo CNVs in SCZ, BD, and controls were enriched for specific functional
categories. In addition, functional categories found to be enriched within each diagnostic group were interrogated in rare CNVs from large independent case-control data sets including 8,290 SCZ, 2,777 BD, and 7,431 controls. For gene set enrichment analysis, we used 39 de novo CNVs including nine in SCZ, ten in BD, four in our controls, and an additional 16 CNVs detected in a previous study by Levy et al. (2011) in an independent set of control subjects using the same array platform. We chose to use only de novo CNVs as a control set. Naturally occurring variants in the population
do not make an ideal control set for this analysis because the gene content of these CNVs is shaped by natural selection and is not likely to be representative of random mutation. Gene set enrichment analysis was performed on the sets of genes impacted by de novo CNVs in SCZ, BD, and controls. The primary step was performed using the “DAVID Bioinformatics Resources 6.7” website (http://david.abcc.ncifcrf.gov/) with Gene Ontology terms (biological processes [GO_BP], cellular components [GO_CC], and molecular functions [GO_MF]) as well as the KEGG, BioCarta, BBID, and Panther pathway databases, excluding pathway results containing < 3 CNV genes. We selected the nonredundant pathways from DAVID with p value < 0.05 for further analysis by permutation-based tests. Based on analysis using DAVID, eight categories were enriched among de novo CNVs in SCZ (Table 4), seven categories were enriched among de novo CNVs in BD (Table 5), and nine categories were enriched among de novo CNVs in controls (Table S7). The enrichment test performed
within the DAVID software does not correct for biases of CNVs toward certain functional classes of genes, and toward large genes in particular. To correct for these biases, we applied two permutation-based tests to the pathways found to be enriched by DAVID. First, we performed a case-only permutation-based test by constructing empirical null distributions that took the CNV size distribution and gene number into account. We randomly placed 10,000 sets of CNVs (each with the same number of events and the same size distribution as the observed set) throughout the genome. Placement on any autosome was allowed, but we sampled such that placement on chromosomes was weighted in proportion to the total number of de novo CNVs observed on the respective chromosome.
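The case-only permutation procedure described above can be illustrated with a minimal sketch. It is not the code used in this study. The function names, the toy data structures, the uniform choice of start position within a chromosome, and the add-one form of the empirical p value are all assumptions made for illustration. The sketch preserves the number of events and their sizes and weights chromosome choice by observed de novo CNV counts, but it does not attempt to match gene number.

```python
import random


def place_cnv_set(sizes, chrom_lengths, chrom_weights, rng):
    """Randomly place one CNV per observed size (hypothetical helper).

    Chromosomes are sampled with probability proportional to chrom_weights,
    e.g. the number of de novo CNVs observed on each autosome.
    """
    placed = []
    for size in sizes:
        # Only chromosomes that can hold the CNV and carry nonzero weight.
        eligible = [c for c in chrom_weights
                    if chrom_weights[c] > 0 and chrom_lengths[c] >= size]
        if not eligible:
            raise ValueError(f"no chromosome can hold a CNV of size {size}")
        weights = [chrom_weights[c] for c in eligible]
        chrom = rng.choices(eligible, weights=weights, k=1)[0]
        # Assumption: start position is uniform along the chromosome.
        start = rng.randint(0, chrom_lengths[chrom] - size)
        placed.append((chrom, start, start + size))
    return placed


def count_category_genes(cnvs, genes, category):
    """Count distinct category genes overlapped by any CNV (half-open intervals)."""
    hit = set()
    for chrom, start, end in cnvs:
        for name in category:
            if name not in genes:
                continue
            g_chrom, g_start, g_end = genes[name]
            if g_chrom == chrom and g_start < end and g_end > start:
                hit.add(name)
    return len(hit)


def empirical_p(observed, sizes, chrom_lengths, chrom_weights,
                genes, category, n_perm=10000, seed=0):
    """Fraction of random CNV sets hitting at least `observed` category genes."""
    rng = random.Random(seed)
    at_least = 0
    for _ in range(n_perm):
        cnvs = place_cnv_set(sizes, chrom_lengths, chrom_weights, rng)
        if count_category_genes(cnvs, genes, category) >= observed:
            at_least += 1
    # Add-one correction (an assumption) so the estimate is never exactly zero.
    return (at_least + 1) / (n_perm + 1)


# Toy example with made-up coordinates, not real genome data.
toy_lengths = {"chr1": 1_000_000, "chr2": 500_000}
toy_weights = {"chr1": 3, "chr2": 1}           # observed de novo CNVs per chromosome
toy_genes = {"GENE_A": ("chr1", 100_000, 150_000),
             "GENE_B": ("chr2", 200_000, 260_000)}
toy_sizes = [50_000, 120_000, 300_000]         # observed CNV sizes
p = empirical_p(1, toy_sizes, toy_lengths, toy_weights,
                toy_genes, {"GENE_A", "GENE_B"}, n_perm=1000)
```

A real analysis would replace the toy dictionaries with a genome annotation and would use an interval index instead of the linear scan over genes, which is kept here only for readability.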