## Abstract

Maximal biclique (also known as complete bipartite) subgraphs can model many applications in Web mining, business, and bioinformatics. Enumerating maximal biclique subgraphs from a graph is a computationally challenging problem, as the size of the output can become exponentially large with respect to the vertex number when the graph grows. In this paper, we efficiently enumerate them through the use of closed patterns of the adjacency matrix of the graph. For an undirected graph G without self-loops, we prove that 1) the number of closed patterns in the adjacency matrix of G is even, 2) the number of the closed patterns is precisely double the number of maximal biclique subgraphs of G, and 3) for every maximal biclique subgraph, there always exists a unique pair of closed patterns that matches the two vertex sets of the subgraph. Therefore, the problem of enumerating maximal bicliques can be solved by using efficient algorithms for mining closed patterns, which are algorithms extensively studied in the data mining field. However, this direct use of existing algorithms causes a duplicated enumeration. To achieve high efficiency, we propose an O(mn) time delay algorithm for a nonduplicated enumeration, in particular, for enumerating those maximal bicliques with a large size, where m and n are the number of edges and vertices of the graph, respectively. We evaluate the high efficiency of our algorithm by comparing it to stateof-the-art algorithms on three categories of graphs: randomly generated graphs, benchmarks, and a real-life protein interaction network. In this paper, we also prove that if self-loops are allowed in a graph, then the number of closed patterns in the adjacency matrix is not necessarily even, but the maximal bicliques are exactly the same as those of the graph after removing all the self-loops.

Original language | English (US) |
---|---|

Pages (from-to) | 1625-1636 |

Number of pages | 12 |

Journal | IEEE Transactions on Knowledge and Data Engineering |

Volume | 19 |

Issue number | 12 |

DOIs | |

State | Published - Dec 2007 |

## Keywords

- Bioinformatics (genome or protein) database
- Closed patterns
- Maximal biclique subgraphs
- Mining methods and algorithms

## ASJC Scopus subject areas

- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics