 |
Index for Section 5 |
|
 |
Alphabetical listing for N |
|
 |
Bottom of page |
|
NEWSFEEDS(5)
NAME
newsfeeds - determine where Usenet articles get sent
DESCRIPTION
The file <pathetc in inn.conf>/newsfeeds specifies how incoming articles
should be distributed to other sites. It is parsed by the InterNetNews
server innd(8) when it starts up, or when directed to by ctlinnd(8).
The file is interpreted as a set of lines according to the following rules.
If a line ends with a backslash, then the backslash, the newline, and any
whitespace at the start of the next line is deleted. This is repeated
until the entire ``logical'' line is collected. If the logical line is
blank, or starts with a number sign (``#''), it is ignored.
All other lines are interpreted as feed entries. An entry should consist
of four colon-separated fields; two of the fields may have optional sub-
fields, marked off by a slash. Fields or sub-fields that take multiple
parameters should be separated by a comma. Extra whitespace can cause
problems. Except for the site names, case is significant. The format of
an entry is:
sitename[/exclude,exclude,...]\
:pattern,pattern...[/distrib,distrib...]\
:flag,flag...\
:param
Each field is described below.
The sitename is the name of the site to which a news article can be sent.
It is used for writing log entries and for determining if an article should
be forwarded to a site. If sitename already appears in the article's Path
header, then the article will not be sent to the site. The name is usually
whatever the remote site uses to identify itself in the Path line, but can
be almost any word that makes sense; special local entries (such as
archivers or gateways) should probably end with an exclamation point to
make sure that they do not have the same name as any real site. For
example, ``gateway'' is an obvious name for the local entry that forwards
articles out to a mailing list. If a site with the name ``gateway'' posts
an article, when the local site receives the article it will see the name
in the Path and not send the article to its own ``gateway'' entry. See
also the description of the ``Ap'' flag, below. If an entry has an
exclusion sub-field, then the article will not be sent to that site if any
of the names specified as excludes appear in the Path header. The same
sitename can be used more than once - the appropriate action will be taken
for each entry that should receive the article, regardless of the name -
although this is recommended only for program feeds to avoid confusion.
Case is not significant in site names.
The patterns specify which groups to send to the site and are interpreted
to build a ``subscription list'' for the site. The default subscription is
to get all groups. The patterns in the field are wildmat(3)-style
patterns, and are matched in order against the list of newsgroups that the
local site receives. If the first character of a pattern is an exclamation
mark, then any groups matching the pattern are removed from the
subscription, otherwise any matching groups are added. For example, to
receive all ``comp'' groups, but only comp.sources.unix within the sources
newsgroups, the following set of patterns can be used:
comp.*,!comp.sources.*,comp.sources.unix
There are three things to note about this example. The first is that the
trailing ``.*'' is required. The second is that, again, the result of the
last match is the most important. The third is that ``comp.sources.*''
could be written as ``comp.sources*'' but this would not have the same
effect if there were a ``comp.sources-only'' group.
There is also a way to subscribe to a newsgroup negatively. That is to
say, do not send this group even if the article is cross-posted to a
subscribed newsgroup. If the first character of a pattern is an atsign
``@'', it means that any article posted to a group matching the pattern
will not be sent even though the article may be cross-posted to a group
which is subscribed. The same rules of precedence apply in that the last
match is the one which counts. For example, if you want to prevent all
articles posted to any "alt.binaries.warez" group from being propagated
even if it is cross-posted to another "alt" group or any other group for
that matter, then the following set of patterns can be used:
alt.*,@alt.binaries.warez.*,misc.*
If you reverse the alt.* and alt.binaries.warez.* patterns, it would
nullify the atsign because the result of the last match is the one that
counts. Using the above example, if an article is posted to one or more of
the alt.binaries.warez.* groups and is cross-posted to misc.test, then the
article is not sent.
See innd(8) for details on the propagation of control messages.
A subscription can be further modified by specifying ``distributions'' that
the site should or should not receive. The default is to send all articles
to all sites that subscribe to any of the groups where it has been posted ,
but if an article has a Distribution header and any distribs are specified,
then they are checked according to the following rules:
1. If the Distribution header matches any of the values in the sub-field,
then the article is sent.
2. If a distrib starts with an exclamation point, and it matches the
Distribution header, then the article is not sent.
3. If Distribution header does not match any distrib in the site's entry,
and no negations were used, then the article is not sent.
4. If Distribution header does not match any distrib in the site's entry,
and any distrib started with an exclamation point, then the article is
sent.
If an article has more than one distribution specified, then each one is
according to the above rules. If any of the specified distributions
indicate that the article should be sent, it is; if none do, it is not sent
- the rules are used as a ``logical or.'' It is almost definitely a mistake
to have a single feed that specifies distributions that start with an
exclamation point along with some that don't.
Distributions are text words, not patterns; entries like ``*'' or ``all''
have no special meaning.
The flags parameter specifies miscellaneous parameters. They may be
specified in any order; flags that take values should have the value
immediately after the flag letter with no whitespace. The valid flags are:
<size
An article will only be sent to the site if it is less than size bytes
long. The default is no limit.
>size
An article will only be sent to the site if it is greater than size
bytes long. The default is no limit.
Achecks
An article will only be sent to the site if it meets the requirements
specified in the checks, which should be chosen from the following
set:
c Exclude all kinds of control messages
C Only include all kinds of control messages
d Distribution header required
e Only include message whose newsgroups in
Newsgroups header all exist in active
o Overview data is created
O Propagate articles without X-Trace header
even if ``O'' flag is specified
p Do not check Path header for the sitename
before propagating (the exclusions are
still checked).
If both ``c'' and ``C'' are specified simultaneously, the last
specified one is adopted.
Bhigh/low
If a site is being fed by a file, channel, or exploder (see below),
the server will normally start trying to write the information as soon
as possible. Providing a buffer may give better system performance
and help smooth out overall load if a large batch of news comes in.
The value of the this flag should be two numbers separated by a slash.
The first specifies the point at which the server can start draining
the feed's I/O buffer, and the second specifies when to stop writing
and begin buffering again; the units are bytes. The default is to do
no buffering, sending output as soon as it is possible to do so.
Ccount
If this flag is specified, an article will only be sent to the site if
the following is true: The number of groups it is posted to, plus the
square of the number of groups followups would appear in, is no more
than count newsgroups. 30 is a good value for this flag.
Fname
This flag specifies the name of the file that should be used if it is
necessary to begin spooling for the site (see below). If name is not
an absolute pathname, it is taken to be relative to
<pathoutgoing in inn.conf>. Then, if the destination is a directory,
the file togo in that directory will be used as filename.
Gcount
If this flag is specified, an article will only be sent to the site if
it is posted to no more than count newsgroups. This has the problem
of filtering out many FAQs, and also RFDs/CFVs for group creation. The
C or U flags are a better solution.
Hcount
If this flag is specified, an article will only be sent to the site if
it has count or fewer sites in its Path line. This flag should only
be used as a rough guide because of the loose interpretation of the
Path header; some sites put the poster's name in the header, and some
sites that might logically be considered to be one hop become two
because they put the posting workstation's name in the header. The
default value for count is one.
Isize
The flag specifies the size of the internal buffer for a file feed.
If there are more file feeds than allowed by the system, they will be
buffered internally in least-recently-used order. If the internal
buffer grows bigger then size bytes, however, the data will be written
out to the appropriate file. The default value is <SITE_BUFFER_SIZE
in include/config.h> (typically (16 * 1024 )) bytes.
Nmodifiers
The newsgroups that a site receives are modified according to the
modifiers, which should be chosen from the following set:
m Only moderated groups
u Only unmoderated groups
OOriginator
If this flag is used then articles sent to this feed must contain a
X-Trace header and the first field in the header must match
originator. Originator can be a list of wildmat(3)-style pattern.
The list is separated by ``/''. Article is never sent, if the first
character of the pattern begins with ``@'' and rest of the pattern
matches. One use of this flag is to restrict the feed to locally
generated posts.
Ppriority
The nice priority that this channel or program feed should receive.
This should be a positive number between 0 and 20, and is the priority
that the new process will run with. This flag can be used to raise
the priority to normal if you are using the ``nicekids'' in
inn.conf(5).
Ssize
If the amount of data queued for the site gets to be larger than size
bytes, then the server will switch to spooling, appending to a file
specified by the ``F'' flag, or <pathoutgoing in inn.conf>/sitename if
the ``F'' flag is not specified. Spooling usually happens only for
channel or exploder feeds.
Ttype
This flag specifies the type of feed for the site. Type should be a
letter chosen from the following set:
c Channel
f File
l Log entry only
m Funnel (multiple entries feed into one)
p Program
x Exploder
Each feed is described below in the section on feed types. The
default is Tf.
Ucount
If this flag is specified, an article will only be sent to the site if
followups to this article would be posted to no more than count
newsgroups.
Witems
If a site is fed by file, channel, or exploder, this flag controls
what information is written. If a site is fed by a program, only the
asterisk (``*'') has any effect. The items should be chosen from the
following set:
b Size of wire formatted article in bytes
e The time article will be expired as
seconds since epoch
``0'' means there is no ``Expires'' header
f Token of the article (equivalent to ``n'')
g The newsgroup the article is in;
if cross-posted, then the first of the
groups this site gets (the differnece from
``G'' is that article belongs to its
newsgroup even if it is congrol message)
h Article's Message-ID hash key
m Article's Message-ID
n Token of the article
p The time the article was posted as seconds
since epoch
s The site that fed the article to the
server; from the Path: header or the IP
address of the site that sent the article
depending on the ``logipaddr'' field in
inn.conf(8);
if ``logipaddr'' is ``true'' and IP
address is ``0.0.0.0'' which means the
article is feeded from localhost (e.g.
from rnews), the result will be retrieved
from Path: header
t The time article was received as seconds
since epoch
* Names of the appropriate funnel entries;
or all sites that get the article
D Value of the Distribution header; ? if
none present
G Where the article is stored; if cross-
posted, then the first of the groups this
site gets (the differnece from ``g'' is
that cmsg belongs to ``control.cmsg'')
H All headers
N Value of the Newsgroups header
P Article's Path header
O Overview data
R Information needed for replication
More than one letter can be used; the entries will be separated by a
space, and written in the order in which they are specified. The
default is Wn.
The ``H'' and ``O'' items are intended for use by programs that create
news overview databases. If ``H'' is present, then the all the
article's headers are written followed by a blank line. An Xref
header (even if one does not appear in the filed article) and a Bytes
header, specifying the article's size (see ``b'' description below for
clarifying size meaning), will also be part of the headers. If used,
this should be the only item in the list; if preceded by other items,
however, a newline will be written before the headers. The ``nteO''
generates input to the overchan(8) program.
The asterisk has special meaning. It expands to a space-separated
list of all sites that received the current article. If the site is
the target of a funnel however (i.e., it is named by other sites which
have a ``Tm'' flag), then the asterisk expands to the names of the
funnel feeds that received the article. If the site is fed by a
program, then an asterisk in the param field will be expanded into the
list of funnel feeds that received the article. A site fed by a
program cannot get the site list unless it is the target of other
``Tm'' feeds.
The interpretation of the param field depends on the type of feed, and is
explained in more detail below in the section on feed types. It can be
omitted.
The site named ME is special. There must be exactly one such entry, and it
should be the first entry in the file. If the ME entry has an exclusion
sub-field, then the articles are rejected if any of the names specified as
excludes appear in the Path header. If the ME entry has a subscription
list, then that list is automatically prepended to the subscription list of
all other entries. For example, ``*,!control,!junk,!foo.*'' can be used to
set up the initial subscription list for all feeds so that local postings
are not propagated unless ``foo.*'' explicitly appears in the site's
subscription list. Note that most subscriptions should have
``!junk,!control'' in their pattern list; see the discussion of ``control
messages'' in innd(8). (Unlike other news software, it does not affect
what groups are received; that is done by the active(5) file.)
If the ME entry has a distribution subfield, then only articles that match
the distribution list are accepted; all other articles are rejected. A
commercial news server, for example, might have ``/!local'' to reject local
postings from other, misconfigured, sites.
FEED TYPES
Innd provides four basic types of feeds: log, file, program, and channel.
An exploder is a special type of channel. In addition, several entries can
feed into the same feed; these are funnel feeds, that refer to an entry
that is one of the other types. Note that the term ``feed'' is technically
a misnomer, since the server does not transfer articles, but reports that
an article should be sent to the site.
The simplest feed is one that is fed by a log entry. Other than a mention
in the news logfile, <pathlog in inn.conf>/news, no data is ever written
out. This is equivalent to a ``Tf'' entry writing to /dev/null except that
no file is opened.
A site fed by a file is the next simplest type of feed. When the site
should receive an article, one line is written to the file named by the
param field. If param is not an absolute pathname, it is taken to be
relative to <pathoutgoing in inn.conf>. If empty, the filename defaults to
<pathoutgoing in inn.conf>/sitename. This name should be unique.
When a site fed by a file is flushed (see ctlinnd), the following steps are
performed. The script doing the flush should have first renamed the file.
The server tries to write out any buffered data, and then closes the file.
The renamed file is now available for use. The server will then re-open
the original file, which will now get created.
A site fed by a program has a process spawned for every article that the
site receives. The param field must be a sprintf(3) format string that may
have a single %s parameter, which will be a token. Standard input will not
be set to the article. Standard output and error will be set to the error
log ( <pathlogininn.conf>/errlog.) The process will run with the user and
group ID of the <pathrun in inn.conf> directory. Innd will try to avoid
spawning a shell if the command has no shell meta-characters; this feature
can be defeated by appending a semi-colon to the end of the command. The
full pathname of the program to be run must be specified; for security,
PATH environment is not searched.
If the entry is the target of a funnel, and if the ``W*'' flag is used,
then a single asterisk may be used in the param field where it will be
replaced by the names of the sites that fed into the funnel. If the entry
is not a funnel, or if the ``W*'' flag is not used, then the asterisk has
no special meaning.
Flushing a site fed by a program does no action.
When a site is fed by a channel or exploder, the param field names the
process to start. Again, the full pathname of the process must be given.
When the site is to receive an article, the process receives a line on its
standard input telling it about the article. Standard output and error,
and the user and group ID of the all sub-process are set as for a program
feed, above. If the process exits, it will be restarted. If the process
cannot be started, the server will spool input to a file named
<pathoutgoing in inn.conf>/sitename. It will then try to start the process
some time later.
When a site fed by a channel or exploder is flushed, the server closes down
its end of the pipe. Any pending data that has not been written will be
spooled; see the description of the ``S'' flag, above. No signal is sent;
it is up to the program to notice EOF on its standard input and exit. The
server then starts a new process.
Exploders are a superset of channel feeds. In addition to channel
behavior, exploders can be sent command lines. These lines start with an
exclamation point, and their interpretation is up to the exploder. The
following messages are generated automatically by the server:
newgroup group
rmgroup group
flush
flush site
These messages are sent when the ctlinnd command of the same name is
received by the server. In addition, the ``send'' command can be used to
send an arbitrary command line to the exploder child-process. The primary
exploder is buffchan(8).
Funnel feeds provide a way of merging several site entries into a single
output stream. For a site feeding into a funnel, the param field names the
actual entry that does the feeding.
EXAMPLES
## Initial subscription list and our distributions.
ME:*,!junk,!foo.*/world,usa,na,ne,foo,ddn,gnu,inet\
::
## Feed all moderated source postings to an archiver
source-archive!:!*,*sources*,!*wanted*,!*.d\
:Tc,Wn:<PREFIX specified with --prefix at configure>/bin/archive -f -i \
/usr/spool/news.archive/INDEX
## Watch for big postings
watcher!:*\
:Tc,Wbnm\
:exec awk '$1 > 1000000 { print "BIG", $2, $3 }' >/dev/console
## A UUCP feed, where we try to keep the "batching" between 4 and 1K.
ihnp4:/world,usa,na,ddn,gnu\
:Tf,Wnb,B4096/1024:
## Usenet as mail; note ! in funnel name to avoid Path conflicts.
## Can't use ! in "fred" since it would like look a UUCP address.
fred:!*,comp.sources.unix,comp.sources.bugs\
:Tm:mailer!
barney@bar.com:!*,news.software.nntp,comp.sources.bugs\
:Tm:mailer!
mailer!:!*\
:W*,Tp:/usr/ucb/Mail -s "News article" *
## NNTP feeds fed off-line via nntpsend or equivalent.
feed1::Tf,Wnm:feed1.domain.name
peer.foo.com:foo.*:Tf,Wnm:peer.foo.com
## Real-time transmission.
mit.edu:/world,usa,na,ne,ddn,gnu,inet\
:Tc,Wnm:<PREFIX specified with --prefix at configure>/bin/nntplink -i stdin mit.edu
## Two sites feeding into a hypothetical NNTP fan-out program:
nic.near.net:\
:Tm:nntpfunnel1
uunet.uu.net/uunet:!ne.*/world,usa,na,foo,ddn,gnu,inet\
:Tm:nntpfunnel1
nntpfunnel1:!*\
:Tc,Wmn*:<PREFIX specified with --prefix at configure>/bin/nntpfanout
## A UUCP site that wants comp.* and moderated soc groups
uucpsite!comp:!*,comp.*/world,usa,na,gnu\
:Tm:uucpsite
uucpsite!soc:!*,soc.*/world,usa,na,gnu\
:Tm,Nm:uucpsite
uucpsite:!*\
:Tf,Wnb:/usr/spool/batch/uucpsite
The last two sets of entries show how funnel feeds can be used. For
example, the nntpfanout program would receive lines like the following on
its standard input:
<123@litchi.foo.com> comp/sources/unix/888 nic.near.net uunet.uu.net
<124@litchi.foo.com> ne/general/1003 nic.near.net
Since the UUCP funnel is only destined for one site, the asterisk is not
needed and entries like the following will be written into the file:
<qwe#37x@snark.uu.net> comp/society/folklore/3
<123@litchi.foo.com> comp/sources/unix/888
HISTORY
Written by Rich $alz <rsalz@uunet.uu.net> for InterNetNews. This is
revision 1.29.2.1, dated 2000/08/17.
SEE ALSO
active(5), buffchan(8), ctlinnd(8), inn.conf(5), innd(8), wildmat(3).
 |
Index for Section 5 |
|
 |
Alphabetical listing for N |
|
 |
Top of page |
|