Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Paper page - Converting Transformers into DGNNs Form
[go: Go Back, main page]

Papers
arxiv:2502.00585

Converting Transformers into DGNNs Form

Published on Feb 1, 2025
Authors:
,
,
,

Abstract

Converter, a model that transforms Transformers into Directed Graph Neural Networks using synthetic unitary digraph convolution, achieves superior performance with computational efficiency and simplicity across various tasks.

Recent advances in deep learning have established Transformer architectures as the predominant modeling paradigm. Central to the success of Transformers is the self-attention mechanism, which scores the similarity between query and key matrices to modulate a value matrix. This operation bears striking similarities to digraph convolution, prompting an investigation into whether digraph convolution could serve as an alternative to self-attention. In this study, we formalize this concept by introducing a synthetic unitary digraph convolution based on the digraph Fourier transform. The resulting model, which we term Converter, effectively converts a Transformer into a Directed Graph Neural Network (DGNN) form. We have tested Converter on Long-Range Arena benchmark, long document classification, and DNA sequence-based taxonomy classification. Our experimental results demonstrate that Converter achieves superior performance while maintaining computational efficiency and architectural simplicity, which establishes it as a lightweight yet powerful Transformer variant.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2502.00585
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2502.00585 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2502.00585 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2502.00585 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.