match_genes                package:misc                R Documentation

Description:

     Finds one possible set of control genes for a given set of genes
     of interest. Control genes are genes whose biological function
     differs from that of the genes of interest (how function is coded
     is up to the user). They are selected based on an optimal distance
     to the genes of interest.

Usage:

     match_genes(all, interest, ...)

Arguments:

     all: data.frame containing all genes that could possibly be a
          control gene.

interest: data.frame containing the genes of interest.

     opt: the optimal distance, i.e., the distance from the genes of
          interest that the control genes should be close to.
          Default: opt = 10^4

   graph: logical indicating whether a graph of the distances should
          be produced. Default: graph = FALSE

    list: logical indicating which kind of result is returned: just
          the data.frame with the matches (list = FALSE) or a list
          with the data.frame and the median match distance
          (list = TRUE). Default: list = FALSE

Details:

     Both the all data.frame and the interest data.frame must contain
     five columns: gene ID, chromosome, start position, end position
     and function. The last column may be treated as a filter, which
     can be biological function, gene ontology, etc.

Value:

     If list = TRUE: a list of two elements. The first element is the
     match data.frame, in which each gene of interest is matched to a
     control gene along with the relative distance between the two.
     The second element is the median of the relative distances, which
     is used by another function.

     If list = FALSE: the match data.frame alone, which comprises each
     gene of interest matched to a control gene and the relative
     distance between the two.

Author(s):

     Murillo Fernando Rodrigues

References:

     None.

Examples:

     # Run match_genes with both required data.frames, keeping the
     # default opt and graph settings and requesting the list output.
     # The data.frames used here are available for testing.
     all <- read.csv("all_genes.csv", header = TRUE)
     interest <- read.csv("genes_of_interest.csv", header = TRUE)
     match_genes(all, interest, list = TRUE)
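
The five-column input layout described under Details can be sketched with small hand-built data.frames, which may be useful when the test CSV files are not at hand. This is only an illustration: the column names, gene IDs, coordinates and function labels below are hypothetical, the columns are assumed to be expected in the order listed under Details, and check_gene_table is a made-up helper that is not part of the package.

```r
# Toy inputs in the five-column layout described under Details.
# All names and values are hypothetical and chosen only for illustration.
all_genes <- data.frame(
  gene_id = c("geneA", "geneB", "geneC", "geneD"),
  chr     = c("2L", "2L", "2R", "3L"),
  start   = c(1000, 25000, 4000, 90000),
  end     = c(3500, 27500, 8000, 95000),
  func    = c("metabolism", "transport", "metabolism", "signaling"),
  stringsAsFactors = FALSE
)

genes_of_interest <- data.frame(
  gene_id = c("geneX", "geneY"),
  chr     = c("2L", "2R"),
  start   = c(12000, 15000),
  end     = c(14000, 17000),
  func    = c("immunity", "immunity"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): returns TRUE when the
# input is a data.frame with exactly five columns, numeric start and end
# positions (columns 3 and 4), and every start <= its end.
check_gene_table <- function(df) {
  is.data.frame(df) &&
    ncol(df) == 5 &&
    is.numeric(df[[3]]) &&
    is.numeric(df[[4]]) &&
    all(df[[3]] <= df[[4]])
}

check_gene_table(all_genes)
check_gene_table(genes_of_interest)
```

Tables that pass such a check could then be supplied as the all and interest arguments of match_genes in place of the CSV-derived data.frames.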