## Chapter 6

### 6.1 Kinds of Modeling Commands

**SOLAR** has four major high-level genetic modeling commands: **polygenic**, **twopoint**, **multipoint**, and **bayesavg**, each described in the sections below.

These commands (which are actually **SOLAR/Tcl** scripts) create, maximize, and compare many models automatically. Modeling can also be done more manually using the elementary SOLAR commands. The basic command underlying all the large analysis scripts is **maximize** which adjusts model parameter values to achieve the *maximum likelihood*.

In rare cases, **SOLAR** may be unable to maximize models when the data are barely adequate or the *curvature of the likelihood surface* exceeds numerical limits. The loglikelihood and/or the parameter values may go to illegal numerical values. This is called *Convergence Failure*. Techniques for overcoming convergence failure are discussed below in Section 6.8.

### 6.2 Quantitative Genetic or Polygenic Analysis

*Quantitative Genetic* or *Polygenic analysis* can be done using the **polygenic** command. First, you must load the pedigree and phenotypes files (if you have not done so already), specify your trait, and specify your covariates. Then you can run **polygenic**. (If you have created any other models before, you should give the **model new** command first.)
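A minimal session might look like this (file, trait, and covariate names are hypothetical):

```
load pedigree ped.txt
load phenotypes phen.txt
model new
trait q4
covariate age sex
polygenic
```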

Covariates may include *sex* (found in the pedigree file) and any of the variables included in the phenotypes file. Variables may be raised to any power using the **^** operator (for example, **age^2** is *age squared*). Interactions are specified using the **\*** operator (so **age\*sex** is the interaction of age and sex, usually called *age by sex*). You may list all the covariates in one command and/or use several commands; the list of covariates is cumulative. If you simply want to include ALL the variables (other than the trait) in the phenotypes file as covariates, you may use the **automodel** or **allcovar** commands (which will also include all the age and sex interactions and squares). In the covariate command, the pound sign (**#**) is a shortcut for *both variables and their interactions*, and commas can be used to separate multiple exponents. So a list of 5 covariates such as **age**, **age^2**, **sex**, **age\*sex**, and **age^2\*sex** could be abbreviated like this:
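For example:

```
covariate age^1,2#sex
```

where **age^1,2#sex** expands to **age**, **age^2**, **sex**, **age\*sex**, and **age^2\*sex**.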

The polygenic command will calculate H2r (the *polygenic heritability*), the *significance* of H2r, and the *proportion of variance caused by the covariates*. The coefficients of the covariates are the values of the parameters in the model itself. You can display the entire *current model* with the command **model**, or just the parameters with the command **parameters**. If we wanted to see just the value of one parameter, **bage** (the *beta coefficient* or *parameter* for age), we could use the command:
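For example:

```
parameter bage
```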

To see the calculated *standard error* for this parameter, we use the selector **se**:
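Presumably like this, with the selector following the parameter name:

```
parameter bage se
```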

A popular option, covariate screening, can be added with the **-screen** argument:
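For example:

```
polygenic -screen
```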

(**-screen** may be abbreviated as **-s**)

Covariate screening will determine the statistical *significance* of each covariate, and remove it from the final model if it falls below a specified level. By default, the significance criterion level is 0.1 (chosen to retain covariates unless they are unlikely to be important). The significance criterion can be changed by using the **-prob** argument:
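For example, to use a criterion of 0.05 (the value here is illustrative):

```
polygenic -screen -prob 0.05
```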

If you would like to determine the significance of each covariate, but would not like to remove **any** of them from the final model, you can force all the covariates into the final model with the **-all** argument:
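For example:

```
polygenic -screen -all
```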

You can also selectively force covariates into the final model using one or more **-f** arguments. Each one is followed by the name of one covariate:
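For example (covariate names hypothetical):

```
polygenic -screen -f age -f sex
```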

When completed, polygenic will provide a summary of the results, which will look something like this:

New with **SOLAR** version 2.0.5 and above, a *Residual Kurtosis* test is performed. The residuals of the trait (after the covariate influences have been removed) are written to a file named **polygenic.residuals** and then the *kurtosis* is computed. If there are no covariates, only the kurtosis of the trait itself is tested. If the Residual Kurtosis exceeds 0.8, you are given a warning. In that case, you are advised to see **Note 5** in the help message for **polygenic**. High kurtosis indicates a trait distribution not well suited for SOLAR linkage analysis because unrealistically high LOD scores might be produced. However, you can mitigate the effects of high kurtosis using **tdist** or **lodadj** (see Section 6.9).

Output files will be written to a subdirectory named by the **trait** or **outdir** commands. They will include a file named **polygenic.out**, which is like the summary shown above. There will also be a file named **polygenic.logs.out** which will show all the loglikelihoods used in the calculations. There will also be many models, including the final model, which will be named **poly.mod**. For convenience, a copy of that model named **null0.mod** will be made for use in **twopoint** and **multipoint** linkage scans. The final polygenic model will be left in memory at the conclusion. If screening was done, it will include only the covariates which met the significance criterion or were fixed by the user. The information summary shown above mentions the important files and models written.

### 6.3 Twopoint Scanning

A twopoint scan is done by the **twopoint** command (itself a **SOLAR/Tcl** script), which will scan all the markers currently in the **ibddir**. The *LOD score* and **H2q1** (proportion of variance associated with a linkage element) will be reported for each marker. The report is displayed on your terminal and also written to the file **twopoint.out**. If you later get additional markers, you may append them to the previous **twopoint.out** file using the **-append** option.
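For example:

```
twopoint
# later, after IBDs for additional markers have been added to the ibddir:
twopoint -append
```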

Twopoint scanning requires a model named **null0** in the output directory. The usual way to create this is with the **polygenic** command. If you are appending new markers to an existing **twopoint.out** , you must be sure you are using the same null model. If not, you should use the **-overwrite** option instead of the **-append** option, and scan the markers all over again.

Twopoint scanning can be done with custom parameterizations. See Section 9.5 for a discussion. One special case of this is the analysis of *dominance*, which is described in Section 9.4.

### 6.4 Multipoint Scanning

The **multipoint** command will scan the selected chromosomes (or the whole genome) for the highest scoring *quantitative trait locus* (*QTL*). If a *criterion* LOD score is specified, and that criterion is met by the highest LOD score, **multipoint** will lock-in the highest QTL and perform another scan to find another QTL. An example was given in Chapter 3. If no criterion is specified, **multipoint** will simply make one pass to find the best single QTL and stop.

Multipoint requires a model named **null0** in the output directory, and the usual way to create this is with the **polygenic** command. It also requires MIBD files (whose preparation is described in Section 5.3).

Before giving the multipoint command, the following other commands are also required:

The following commands are OPTIONAL:
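A typical setup might look like this (directory name and interval size are hypothetical; check the **multipoint** documentation for the authoritative list of required and optional commands):

```
mibddir mibds
chromosome all
interval 5
multipoint
```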

Here is another example multipoint scan:
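Matching the criteria described below, the command would be:

```
multipoint 3 1.9
```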

In this example, multipoint will perform one pass, and then if the LOD for the best QTL is greater than 3, it will make another pass. At the end of the second pass, it will continue making passes until the LOD for the best QTL does not reach 1.9 or higher.

As it is proceeding, **multipoint** will show the LOD score and **H2q1** for each locus tested. When it is done, it will show the highest scoring locus for the last pass, and the QTL’s included in the final model.

Summary output is written to a file named **multipoint.out**. This simply summarizes the loci that were scanned in each pass, and the QTL’s included in the final model. The detailed results for each pass are written to a file named **multipoint*N*.out**, where *N* is the pass number. The best model from each pass is also saved: for example, **null1** is the model with the best QTL from pass 1, **null2** is the model with the best QTL from pass 2 along with the best QTL from pass 1, and so on.

### 6.6 Bayesian Model Averaging

The **Bayesian Model Averaging** method identifies important elements (either linkage elements or covariates) by computing a statistic called the *Bayesian Information Criterion* (*BIC*) for a model containing each possible *combination* (or *set*) of those elements, and then performing additional statistical tests on the models having the highest BICs and the elements they contain. The **SOLAR** command which uses this method is **bayesavg**.

**bayesavg** has been modified recently. Rather than computing the effective **log(n)** by maximizing a fully *saturated* model with all the candidate elements, an estimated **log(n)** derived from the number of elements is used. Later, after the window of important models has been identified, an exact **log(n)** is computed from the model with the best BIC. Not only is this approach more accurate, it bypasses having to maximize a fully saturated model. It does, however, require rewriting the main output file to correct the BIC values. During scanning, the main output file is **bayesavg.est** (or **bayesavg_cov.est** for covariate analysis). Later this file is copied to a file named **bayesavg.nose** (or **bayesavg_cov.nose**) which uses the final **log(n)** but does not have standard error estimates. Finally, the file **bayesavg.out** (or **bayesavg_cov.out**) is written, which has standard error estimates for all the models in the window (unless the **-nose** option has been specified, in which case the **.nose** file is simply renamed to **.out**). A file named **bayesavg.avg** (or **bayesavg_cov.avg**) has the final Bayesian model averages.

#### 6.6.1 Bayesian Linkage Model Averaging

One way of starting **bayesavg** assumes that you already have created a model with all the elements in it. This would be most likely in the case of linkage elements. After having run a *multipoint scan* finding several important linkage elements, you could immediately proceed to **bayesavg**. The default mode is linkage model averaging:
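For example (the model name is hypothetical):

```
load model null2
bayesavg
```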

#### 6.6.2 Bayesian Model Averaging with Covariates

It’s also possible to use **bayesavg** with a starting model containing a slew of covariates. However, there are also alternatives where you start with a base model containing only *fixed* features, to which you want to add a set of *variable* features. Perhaps the most useful option for this is the **-list** option, where you specify a file containing all the covariates (or linkage elements) you want tested. The file contains the name of each item on a separate line. When you specify a list, the covariates already in the model become fixed elements.
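For example (the file name is hypothetical):

```
bayesavg -cov -list mycovariates.lst
```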

#### 6.6.3 QTN Analysis

**SOLAR QTN** analysis is based on Bayesian covariate model averaging. Each candidate single nucleotide polymorphism (*snp*) is coded as a covariate in a phenotypes file. The **-qtn** option has all the features suitable for this application. It automatically invokes the **-cov** and **-stop** options. The **-stop** option stops the analysis when no models of the current size (that is, the current *degrees of freedom*) have been included in the window. When this happens, it is impossible for any models of larger size to be included in the window either. (This *option* will probably become the *default* in a future release.) This is very important for QTN analysis because otherwise the set of all possible models could become astronomically huge. Also, the **-qtn** option includes all covariates with the prefix **snp_** as variable elements, and all the others as fixed elements. Thus, you can add all the regular covariates and the snps to the starting model (which need not be maximized). Also, a special *window file* is produced, named **bayesavg_cov.win**, which contains a matrix of the snps included in each window model.

Before using **bayesavg** on a large set of snps, screening should be done using **qtnm** and **plotqtn** to eliminate redundant snps. Once again, this is done to help prevent the number of models that need to be tested from becoming astronomically huge.

When there are going to be more than 20 snps tested (which would correspond to 1,048,575 models with 1 or more elements in them) it is advisable to use the **-max** option, because the amount of storage necessary to keep all the possible combinations in computer memory (RAM) will start becoming a significant problem. SOLAR might crash when starting the analysis if your computer does not have sufficient memory. Generally, it will not be necessary to run through all the possible degrees of freedom anyway. In most cases, a **-max 10** or less is sufficient. (Use a progressively smaller number as the number of snps increases.) When using **-max**, however, be sure that the **BIC** started to decline for the last degree of freedom actually tested; otherwise a larger **-max** may be required.
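For example, a QTN run limited to 10 degrees of freedom might be started like this (assuming the snp covariates are already in the starting model, as described above):

```
bayesavg -qtn -max 10
```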

### 6.7 Custom Quantitative Genetic Analysis

If you are using the standard **SOLAR** parameterization, but the **polygenic** script does not perform exactly the *quantitative genetic* analysis you require, you can probably do it at a lower level by using the **maximize** command to maximize models you have set up exactly as you want using the **parameter** , **constraint** , **omega** , and **mu** commands.

This is less necessary now than it was in the past, because **polygenic** can now handle constrained covariate beta parameters, which was previously one of its more glaring limitations. (See **Note 4** in the documentation for **polygenic** for further details.)

The direct use of the **maximize** command was introduced in Section 3.5 of the Tutorial (Chapter 3). If you need to use a *custom parameterization*, that is discussed in Section 9.5.

Similar ideas can be employed to do a custom *linkage* analysis.

Underlying all modeling commands in **SOLAR** is the **maximize** command, which will maximize the loglikelihood of the model by adjusting the values of the parameters, such as the variance components and covariate betas. In addition, the standard errors for each parameter will be computed, though this is a time-consuming operation which may be switched off with the **standerr** option:
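Presumably like this (treating **standerr** as a model option is an assumption; check the **standerr** documentation for the exact form):

```
option standerr 0
maximize
```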

If there are covariates, it is usually best to maximize a *sporadic* model (having no genetic elements) first because the determination of covariate betas can then be done analytically. Then you can maximize the corresponding *polygenic* model. If you are using the standard parameters, you can use the **spormod** command to set up the parameters, constraints, and omega as required to make a model sporadic, and the **polymod** command to set up the parameters, constraints, and omega as required to make a model polygenic. You can call these commands one after another, in either order, and as many times as you like (at least that was the design intent). There is also a **linkmod** command which sets up the parameters, constraints, and omega as required for a *linkage* model. It is similar to the other commands, but it requires that a polygenic model has been set up first. After **linkmod**, you could go back to a polygenic model with the **polymod** command.

If you have been working with other models in the same session, it is safest to wipe the slate clean with the **model new** command.
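A minimal manual sequence might look like this (trait and covariate names hypothetical):

```
model new
trait q4
covariate age sex
spormod
maximize
polymod
maximize
save model null0
```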

Illustrated above is what the **polygenic** command does in its most minimal form, without running any extra models to compute statistics. The most important thing is that a model named **null0** has been created to use as the *null* model in linkage analysis.

To make it easy to compare different models, SOLAR has commands such as **loglike** (return the natural log likelihood of the recently maximized model), **chi** (compute the probability of rejecting the null hypothesis given a chi-square value) and **clod** (compute the lod score given two log likelihoods). The following example will compute the p value for the covariate age in a polygenic model (similar to what is done by the **polygenic -screen** command). Notice that chi here returns a string such as “p = 0.0153740”. The **Tcl** **set** command is used to save intermediate results in variables, and the Tcl **expr** command is used to evaluate mathematical expressions.
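A sketch of such a calculation (trait and covariate names hypothetical; here the reduced model is obtained by constraining the covariate beta to zero, which is one way of removing the covariate):

```
model new
trait q4
covariate age
polymod
maximize
set loglike1 [loglike]
constraint bage = 0
maximize
set loglike0 [loglike]
set chisq [expr 2.0 * ($loglike1 - $loglike0)]
chi $chisq 1
```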

In the next example, the p of the H2r in a polygenic model will be computed similar to the way it is done by the **polygenic** command. In this case, we will use the **chi** command to return a number so that it can be divided by two (which is done because H2r is fixed to a boundary).
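A hedged sketch (the **-number** option on **chi** is assumed here as the way to get a bare number back; check the **chi** documentation):

```
polymod
maximize
set lnlike1 [loglike]
spormod
maximize
set lnlike0 [loglike]
set p [expr [chi -number [expr 2.0 * ($lnlike1 - $lnlike0)] 1] / 2.0]
```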

After a model has been maximized, all the parameter values in memory are updated. The current value of a particular parameter can be displayed with the **parameter** command, or the entire model can be displayed with the **model** command. Models can also be saved with the **save model** command, and loaded with the **load model** command.
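For example (the saved model name is hypothetical):

```
parameter h2r
model
save model mypoly
load model mypoly
```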

### 6.8 Techniques for Resolving Convergence Failure

The optimization algorithms used by SOLAR are very flexible and robust, and do not necessarily need trait values following a *Multivariate Normal Distribution* (though that is optimal), and likewise for covariate values; still, one would expect SOLAR to operate best when all data is normally distributed and there is plenty of it. SOLAR includes a number of *heuristics* based on many years of experience with data that has not always been entirely adequate. However, despite all this, there is no guarantee of success. SOLAR maximization will sometimes be unable to find the optimal parameter values. This is called *Convergence Failure*.

Convergence Failure happens most frequently with linkage models. In the scan output files produced by the **multipoint** command, such as **multipoint1.out**, there is a final column (with no header) which gives an error abbreviation, if applicable. This is where you may see the abbreviation **ConvrgErr**. Convergence errors, like lousy weather, will not stop SOLAR from scanning the genome at least once. If convergence errors have occurred during one scan, however, additional scans conditioned on that scan will not be performed, and a warning message will be displayed. When you plot the results, points having convergence errors will be shown with an asterisk symbol. (The symbol is placed at an interpolated position on the LOD curve; you must remember that the LOD of any point where convergence failure occurred is actually unknown.) Unfortunately, convergence failure is probably more likely to occur at *interesting* points such as peaks having the highest LOD score. However, sometimes convergence failure occurs right next to zero LOD scores on both sides. Such convergence failures can probably be ignored: even if they are hiding single-point peaks, those peaks probably reflect random noise rather than an underlying biological process. So, the first thing one can do about SOME convergence errors is: **(1) When convergence errors occur at unimportant QTL’s (surrounded by zero LOD’s on both sides), they can probably be ignored**.

#### 6.8.1 Get *Better Data*

When possible, **(2) one of the most effective ways to solve convergence failures is with better data.** *Better* data can be better in many ways. It can simply be a larger data set. It can also be a data set in which *outliers* have been removed (SOLAR does not automatically remove outliers). It can also be a better *cleaned* data set, in which Mendelian errors have been removed. The cleaning of genetic data should be considered a part of every project based on genetic data. Statistical genetics programs, SOLAR included, are very sensitive to and can be thrown off by even a small number of errors. Here at **SFBR**, data cleaning is a major process involving many people, and some staff are dedicated to it. SOLAR already checks for Mendelian errors, but it does not necessarily detect all errors. Another program which is very useful for finding data errors is **Simwalk**.

#### 6.8.2 Transform Data

**Often it is possible to avoid convergence failure by transforming the data in some way.** The ideal data for SOLAR would have a *multivariate normal distribution*, with a mean of zero and a standard deviation of 1.0. Usually larger values for the standard deviation are OK, but frequently smaller values cause problems. In particular, a trait standard deviation smaller than 0.5 has often been found to cause problems, so SOLAR now has a warning for that. One simple but often useful change is to multiply the trait values by some constant to overcome a small standard deviation. Other traditional transformations, such as taking the natural log or square root of each value, may also be useful. Typically one *looks* at a *histogram* of the data and/or gets some statistics such as the standard deviation and *kurtosis*. (SOLAR has a command **stats** to get the statistics, including kurtosis, for any variable.) Then one decides upon a suitable transformation. Kurtosis less than 0.8 (where zero is defined as the value for a standard normal distribution) is highly desirable. Users of **Pedsys** can use the **tally** program; other usable programs include **Gauss** and **Excel**. The transformation of data prior to statistical analysis is discussed in many textbooks, including *Genetics and Analysis of Quantitative Traits* by Lynch and Walsh (1998, Sinauer Associates, Inc.).
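For example, scaling a trait with a small standard deviation might be done with the **define** command (trait name and scaling factor are hypothetical):

```
stats q4
define q4x10 = q4*10
trait q4x10
```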

**NOTE: The data transformations described below are intended for use with quantitative traits only.** The things you can do with discrete traits are limited and not technically rigorous. However, sometimes you can do exploratory research by pretending your discrete trait is a poorly distributed quantitative trait. That can be accomplished by giving the command **option EnableDiscrete 0** before running polygenic.

##### 6.8.2.1 zscore

To use the zscore command, give the command **zscore** after specifying your trait. Then run polygenic and multipoint as usual. Be sure to read the documentation for **zscore**.
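For example (trait name hypothetical):

```
trait q4
zscore
polygenic
```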

##### 6.8.2.2 Inverse Normalization

Inverse normalization is an important new feature added to **SOLAR** version 4. When a variable (trait or covariate) is inverse normalized, it resembles a normally distributed variable, yet retains much of the original information as far as a linkage analysis is concerned. As with zscore, the variable will have a mean near 0.0 and a standard deviation near 1.0, but it will also have a distribution that approximates normal.

**SOLAR** provides two ways to inverse normalize. The most convenient is to inverse normalize on-the-fly with the **define ** command, for example:
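A sketch of the on-the-fly form (variable names hypothetical):

```
define q4n = inormal_q4
trait q4n
polygenic
```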

The prefix “inormal_” (the underscore is mandatory) causes the prefixed variable to be inverse normalized during maximization.

The other way to inverse normalize is with the **inormal** command, which allows you to create a new file with the inverse normalized variable for use with **SOLAR** or other programs.

##### 6.8.2.3 Residualization

Another useful technique is to *residualize* your trait variable to remove all covariate effects. This technique is especially called for when you can run a full linkage analysis on the trait by itself, but adding important covariates leads to convergence failure. When you residualize your trait, you are implicitly fixing a set of covariate betas, and you simplify the linkage problem by removing those parameters from the maximization.

To do this, you start the analysis by doing a polygenic analysis of your trait and covariates. After the polygenic command, run the **residual** command. This will produce a file named **residual.out** with a residualized trait now called **residual**. Now load the file residual.out as a phenotypes file, and select the variable **residual** as your trait. Now you do not need to select any covariates, as all covariate effects are already incorporated into the residual value, at least as they appeared in the polygenic model.
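The whole sequence might look like this (trait and covariate names hypothetical):

```
trait q4
covariate age sex
polygenic
residual
load phenotypes residual.out
model new
trait residual
polygenic
```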

##### 6.8.3.1 Manage Covariate Boundaries

Covariate parameter boundaries are set automatically by the **maximize** command. The initial boundaries are based on the maximum and minimum values of the covariate variable and the trait. The covariate boundaries are initially set so that the covariate variable could explain twice the variation of the trait variable. (This default ratio is controlled by **option AutoCovarBound**.) If the covariate boundaries are too narrow, they are automatically increased by **covboundincr** each time, for a maximum of **covboundretries** times. You can adjust **AutoCovarBound** using the **option** command, and adjust the other factors using the **covboundincr** and **covboundretries** commands. But generally you will not need to mess with these (I have never known anyone to try), unless you are having trouble with your *sporadic* or *polygenic* models.

##### 6.8.3.2 Manage Variance Component Boundaries and Starting Points

SOLAR already has fairly complicated automatic management of variance component parameter boundaries based on our experience. However, it is not necessarily perfect, and may require some adjustment to handle the most difficult convergence situations. The basic command to use is **boundary** , and you should read the documentation for the **boundary** command as well as the documentation under **boundary-notes** . You can click the links above in this paragraph, browse the documentation later, or use the **help** command to bring up the documentation.

The standard variance component parameters of SOLAR (such as **e2** and **h2r**) have *natural boundaries* of 0 and 1. But allowing the parameters this entire range frequently causes convergence failure, so *artificial boundaries* for each parameter are positioned and moved within that range. At the “end” of the Fortran maximization algorithm, if any of the variance component parameters has hit its boundary, the hit boundary is moved and then the maximization algorithm is called again. On the other hand, if there is a convergence failure, the boundaries around all the variance components are pulled-in (*crunched*), and maximization is tried again. If boundaries have already been crunched and there is still a convergence failure, a second level of more intense crunching is done. During **multipoint** scanning, there are additional types of **retries** which may be tried. Ultimately the effect of all retries will either be a successful convergence, which is what usually happens, or a convergence failure.

Prior to running **multipoint** , you can give **boundary** commands to alter the way the automatic boundary management is done. For example, the **boundary wide start** can be given to use the wide *natural* boundaries whenever possible (unless an actual convergence failure occurs anyway). For more examples and discussion, see the documentation for the boundary command.

#### 6.8.5 Conv, Tol, and Other Convergence Settings

The Fortran maximization algorithm **Search.f** also has a number of convergence settings which can be controlled through the **option** command. Most users should not touch these settings; improper adjustment of the convergence settings could lead to invalid results.

There are four basic settings available named after the Fortran variables they control: **MaxStep, Conv, NConv,** and **Tol** . It has proven useful (starting with version 2.0.4) to use a larger value for Conv for discrete models, so there is now a separate option for **Conv(Discrete)** . There are also two new options to deal with discrete models: **BCliff** and **MaxCliffs** .

Since options are **model dependent**, you should specify or change an option AFTER giving the **trait** command, but before giving a maximization command such as **polygenic** . If you load a model, all option values are taken from the model or use the default otherwise. If you give the **model new** command, all options are set to the default values.
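For example, the ordering described above might look like this (trait name and option value are purely illustrative):

```
model new
trait affect
option MaxStep 10
polygenic
```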

**MaxStep** (which is called MXSTEP in Search.f) determines the number of step decrements which are allowed in each iteration. At the beginning of each iteration, Search.f tries to predict the optimal point by solving the quadratic equation. However, often there is more curvature in the likelihood function than expected, and the predicted point goes over a cliff where likelihood gets worse rather than better. When this happens, Search.f backs up, only going a fraction of the distance in the direction of the predicted optimal point. Once again, that can fail. The number of times SOLAR is allowed to “back up” is limited by **MaxStep** . Sometimes it helps to increase this value. You can see the number of steps that are taken in each iteration in the maximization output files or in the verbose maximization output.

**Conv** and **NConv** work together to determine when maximization has completed. When the loglikelihood has changed by less than **Conv** for **NConv** iterations, maximization is deemed complete. It sometimes helps to make Conv larger to allow maximization to complete when the likelihood surface is extremely curved. This is similar to increasing the tolerances for parameters, which is addressed by the **Tol** setting.

Maximization involves computing the loglikelihood of one set of parameter values and then trying to find a set of parameter values that has an even higher loglikelihood. Unfortunately, sometimes a test point of parameter values will lead to an incomputable loglikelihood. Such an incomputable value, like the result of division by zero, is called **Not A Number** or **NaN**. The code directing discrete trait maximizations, **Optima.f**, had a tendency to ignore these problems under some circumstances. So an additional mechanism has been introduced which handles this problem in the same way that **Search.f** backs up from unexpectedly large (but not incomputable) drops in loglikelihood value. The bad “point” is considered to be “over the cliff,” so we back up from it. Since the situation has apparently gotten very bad, we may need to back up more than usual, so there is a new **MaxCliffs** setting which takes the place of the smaller **MaxStep** setting. There is also a factor, **BCliff**, which determines how “much” to back up. The default value, **0.1**, means that each time we back up from a cliff, we back up by a factor of 10. This is the same (not adjustable) value that is used under “ordinary” circumstances where the likelihood has fallen by a very large but still computable amount.

#### 6.8.6 Enforce Constraints

Sometimes a multipoint run will terminate with a “Constraints not Satisfied” error. This seems very strange at first, because your model started with satisfied constraints. Unfortunately, during maximization there can be numerical problems (such as when very small numbers are added to very large numbers) such that, at the end of a maximization step the constraints are no longer satisfied. This is best understood as a kind of convergence error resulting from numerical limits. There is an EXPERIMENTAL option intended to correct this problem. You can try the command **option EnforceConstraints 1** before running **polygenic** . Unfortunately, in our experience, this simply leads to other kinds of convergence errors later on. Residualization as described above generally works better, if you can use it.

### 6.9 Handling Poor Trait Distributions

**SOLAR** has two mechanisms to deal with trait distributions that are significantly different from *multivariate normality*. These are also applicable when the *Residual Kurtosis* (calculated by the **polygenic** command) is greater than 0.8. These mechanisms are embodied in the commands **tdist** and **lodadj** . Using either of these commands will result in more accurate LOD scores when the trait distribution is not ideal.

#### 6.9.1 The **tdist** command

The **tdist** command sets up an extra parameter, **t_param**, and enables it by setting the **option tdist** to 1. All you need to do is give the **tdist** command prior to running **polygenic** or another maximization command:
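For example (trait name hypothetical):

```
trait q4
tdist
polygenic
```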

Although **tdist** is quick and easy to use, the extra parameter will slow down maximization somewhat, which will slow down genome scans. To turn it off, give the **tdist -off** command.

#### 6.9.2 Empirical LOD Adjustment ( **lodadj** )

Empirical LOD Adjustment, as realized by the **lodadj** command, is one of the most important features of SOLAR. It corrects for the inaccuracy caused by a non-normal trait distribution, but unlike **tdist** it does not make maximization slower or less robust.

To use LOD adjustment, you must first calculate the required adjustment by simulation. This is done by the **lodadj -calc** command, and it may require a considerable amount of time to complete. Ultimately you end up with a factor less than 1.0 (such as 0.9) by which all your LOD scores are (automatically) multiplied. After the LOD adjustment is computed, you simply need to turn it on when desired with the **lodadj** command and off with the **lodadj -off** command.
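The workflow described above can be sketched as the following session (the calculation step is shown once; afterward the stored adjustment is simply toggled on or off):

```
solar> lodadj -calc
solar> lodadj
solar> lodadj -off
```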

There is a full discussion of **lodadj** in Section 10.2.

### 6.10 Advanced Modeling Topics

Advanced modeling topics including discrete traits, bivariate (two traits), dominance, household groups, and arbitrary parameterization are now discussed in Chapter 9.

