
Messages from 17175

Article: 17175
Subject: Programming Xilinx without Foundation
From: mel@cix...co...uk (mel)
Date: Wed, 7 Jul 1999 11:23 +0100 (BST)
For development I'm quite happy programming the Xilinx 9536 using the 
Foundation software. However, now that I have two finished designs, I'm 
finding that the production department are confused by the Foundation 
software (they're quite happy wielding soldering irons about but they're 
not very PC literate).

So, I'm wondering if there's an easy way to embed the JEDEC file and 
programming algorithm into a DOS program. This would provide the simplest 
possible way of programming an on-board device via the PC printer port 
programmer.

--

    /Mel/ (at work)
Article: 17176
Subject: Re: Altera EPC1 replacement?
From: Lior Dvir - Telrad LTD <lior@tibam.elex.co.il>
Date: Wed, 7 Jul 1999 12:59:04 GMT

Look at Atmel. They have EEPROMs compatible with Altera parts.
www.atmel.com
--------------------------------


On Wed, 16 Jun 1999, Peter Sørensen wrote:

> I looked too and found nothing.
> There are no standard serial EPROMs that big.
> I suppose you know of the EPC2, which can not fit into the 8 dip package.
>
> Hi Peter
>
> Garrick Kremesec wrote:
>
> > Hello,
> >
> >    This was brought up recently, but are there any erasable replacements
> > for the 8 pin dip Altera EPC1?  I'm really looking for something that is
> > pin/function equivalent yet either electronically erasable or UV
> > erasable.
> >
> > Thank you for the help.
> >
> > Garrick Kremesec
> > University of Illinois
> > gkremese@ews.uiuc.edu

Article: 17177
Subject: Re: Floating point on fpga, Counters?
From: "Trevor Landon" <landont@ttc.com>
Date: Wed, 7 Jul 1999 09:58:49 -0400
This is very similar to the approach that I first thought of.  The main
problem with it is the need for the barrel shifter to normalize the output
(and the associated cost of implementing it in an FPGA).

I guess the heart of my question was whether there was a clever algorithm to
simplify this problem (perhaps keeping a full-sized counter in the
background, and adding/single-bit shifting the mantissa as needed).

Thanks,
Trevor Landon
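[Editor's note: the "full-sized counter in the background" idea can be sketched behaviorally. This Python model is one guess at the scheme, using the 10-bit mantissa width from the original post; it is an illustration, not a tested FPGA design.]

```python
def fp_counter(n, mant_bits=10):
    """Behavioral model of a floating-point counter that never does a
    full normalize: a small fractional counter tracks the low-order
    count bits, the mantissa is bumped when they roll over, and a
    single one-bit shift handles mantissa overflow."""
    mant, exp, frac = 0, 0, 0
    for _ in range(n):
        frac += 1
        if frac == 1 << exp:            # low bits rolled over: bump mantissa
            frac = 0
            mant += 1
            if mant == 1 << mant_bits:  # mantissa overflow: one-bit shift
                mant >>= 1
                exp += 1
    return mant, exp                    # count ~= mant * 2**exp (truncated)
```

After n increments the pair satisfies mant << exp <= n < (mant + 1) << exp, i.e. the running count truncated to a 10-bit mantissa, with only single-bit shifts ever performed.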

Ray Andraka <randraka@ids.net> wrote in message
news:37828FFE.9BCE92F7@ids.net...
> I think the counter has to be fixed point with as many bits as is required to
> represent the maximum count desired.  If it were floating point, you'd need a
> complementary counter underneath it to resolve increment values below the
> precision of the mantissa as the count increases.  If you need a floating point
> output from the counter, you would use a fixed point counter of the appropriate
> width followed by a normalizing barrel shifter.
>
> Trevor Landon wrote:
>
> > While we are on the FP in FPGA discussion...
> >
> > What algorithms exist for floating point counters?  I would like to
> > implement a 10 bit mantissa, 5 bit exponent FP counter that could
> > increment in roughly 8 clock cycles.
> >
> > I have very little interest in using a full FP adder for the obvious
> > reasons.
> >
> > -Trevor Landon
> > landont@ttc.com
> >
> > Jan Gray <jsgray@acm.org.nospam> wrote in message
> > news:DZrf3.21$qH2.1013@paloalto-snr1.gtei.net...
> > > Roland Paterson-Jones wrote in message <377DC508.D5F1D048@bigfoot.com>...
> > > >It has been variously stated that fpga's are no good for floating point
> > > >operations. Why? As I see it, floating point operations are typically
> > > >just shifted integer operations. Is the bit-width problematic?
> > >
> > > For 16-bit floats with (say) 10 bit mantissas, FPGAs should be *great* for
> > > floating point.  Indeed problems start with the wider bit-widths.  The
> > > area-expensive (and worse than linear scaling) FP components are the barrel
> > > shifters needed for pre-add mantissa operand alignment and post-add
> > > normalization in the FP adder, and of course the FP multiplier array.
> > >
> > > The FCCM papers on this subject include:
> > >
> > > Ligon et al, A Re-evaluation of the practicality of floating-point
> > > operations on FPGAs, FCCM 1998
> > >
> > > Louca et al, Implementation of IEEE single precision floating point addition
> > > and multiplication on FPGAs, FCCM 1996
> > >
> > > Shirazi et al, Quantitative analysis of floating point arithmetic on FPGA
> > > based custom computing machines, FCCM 1995
> > >
> > > and the neat Leong paper on avoiding the problem entirely,
> > >
> > > Leong et al, Automating floating to fixed point translation and its
> > > application to post-rendering 3D warping, FCCM 1999
> > >
> > > See the Ligon paper for a nice presentation of speed-area tradeoffs of
> > > various implementation choices.  Ligon estimates their single-precision FP
> > > adder resource use at between 563 and 629 LUTs -- 36-40% of an XC4020E.  Note
> > > this group used a synthesis tool; a hand-mapped design could be smaller.
> > >
> > > Put another way, that single precision FP adder is almost twice the area of
> > > a pipelined 32-bit RISC datapath.  Ouch.
> > >
> > > The rest of this article explores ideas for slower-but-smaller FP adders.
> > >
> > > The two FP add barrel shifters are the problem.  They each need many LUTs
> > > and much interconnect.  For example, a w-bit-wide barrel shifter is often
> > > implemented as lg w stages of w-bit 2-1 muxes, optionally pipelined.
> > >
> > > Example 1: single-precision in << s, w=24
> > >   m0 = s[0] ? in[22:0] << 1 : in;
> > >   m1 = s[1] ? m0[21:0] << 2 : m0;
> > >   m2 = s[2] ? m1[19:0] << 4 : m1;
> > >   m3 = s[3] ? m2[15:0] << 8 : m2;  // 16 wires 8 high
> > >   out = s[4] ? m3[7:0] << 16 : m3; // 8 wires 16 high
> > > ----
> > > 5*24 2-1 muxes = 120 LUTs
> > >
> > > Example 2: double-precision in << s, w=53
> > >   m0 = s[0] ? in[51:0] << 1 : in;
> > >   m1 = s[1] ? m0[50:0] << 2 : m0;
> > >   m2 = s[2] ? m1[48:0] << 4 : m1;
> > >   m3 = s[3] ? m2[44:0] << 8 : m2;  // 45 wires 8 high
> > >   m4 = s[4] ? m3[36:0] << 16 : m3; // 37 wires 16 high
> > >   out = s[5] ? m4[20:0] << 32 : m4; // 21 wires 32 high
> > > ----
> > > 6*53 2-1 muxes = 318 LUTs
> > >
> > > In a horizontally oriented datapath, the last few mux stages have many
> > > vertical wires, each many LUTs high.  This is more vertical interconnect
> > > than is available in one column of LUTs/CLBs, so the actual area can be much
> > > worse than the LUT count indicates.
> > >
> > > BUT we can of course avoid the barrel shifters, and do FP
> > > denormalization/renormalization iteratively.
> > >
> > > Idea #1: Replace the barrel shifters with early-out iterative shifters.  For
> > > example, build a registered 4-1 mux: w = mux(in, w<<1, w<<3, w<<7).  Then an
> > > arbitrary 24-bit shift can be done in 5 cycles or less in ~1/3 of the area.
> > > For double precision, make it something like w = mux(in, w<<1, w<<4, w<<12),
> > > giving an arbitrary 53-bit shift in 8 cycles.
> > >
> > > Idea #2: (half baked and sketchy) Do FP addition in a bit- or nibble-serial
> > > fashion.
> > >
> > > To add A+B, you
> > >
> > > 1) compare exponents A.exp and B.exp;
> > > 2) serialize A.mant and B.mant, LSB first;
> > > 3) swap (using 2 2-1 muxes) lsb-serial(A.mant) and lsb-serial(B.mant) if
> > > A.exp < B.exp
> > > 4) delay lsb-serial(A.mant) in a w-bit FIFO for abs(A.exp-B.exp) cycles;
> > > 5) bit-serial-add delay(lsb-serial(A.mant)) + lsb-serial(B.mant) for w
> > > cycles
> > > 6) collect in a "sum.mant" shift register
> > > 7) shift up to w-1 cycles (until result mantissa is normalized).
> > >
> > > It may be that steps 4 and 6 are quite cheap, using Virtex 4-LUTs in shift
> > > register mode -- they're variable tap, right?
> > >
> > > It is interesting to consider eliminating steps 2, 6, and 7, by keeping your
> > > FP mantissa values in the serialized representation between operations,
> > > counting clocks since last sum-1-bit seen, and then normalizing (exponent
> > > adjustment only) and aligning *both* operands (via swap/delay) on input to
> > > the next FP operation.  A big chained data computation might exploit many
> > > serially interconnected serial FP adders and serial FP multipliers...
> > >
> > > Is this approach better (throughput/area) than a traditional pipelined
> > > word-oriented FP datapath?  Probably not, I don't know.  But if your FP
> > > needs are modest (Mflops not 100 Mflops) this approach should permit quite
> > > compact FP hardware.
> > >
> > > (Philip Freidin and I discussed this at FCCM99.  Thanks Philip.)
> > >
> > > Jan Gray
>
> --
> -Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email randraka@ids.net
> http://users.ids.net/~randraka
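[Editor's note: the mux-stage barrel shifter quoted above is easy to sanity-check in software. Here is a minimal Python model of Example 1 (w=24), generalizing the five hand-written stages into a loop; it is a behavioral sketch, not synthesizable code.]

```python
def barrel_shift(x, s, width=24):
    """Model of a log-stage barrel shifter: ceil(lg(width)) stages of
    2-1 muxes, where stage k conditionally shifts by 2**k.  Bits that
    would leave the width-bit field are dropped, exactly as the quoted
    pseudocode drops the high bits before each conditional shift."""
    mask = (1 << width) - 1
    shift = 1
    while shift < width:        # 1, 2, 4, 8, 16 -> five stages for w=24
        if s & shift:           # this stage's select bit of s
            x = (x << shift) & mask
        shift <<= 1
    return x
```

For every shift amount s in 0..23 the staged result equals the direct shift (x << s) truncated to 24 bits, which is what the five-stage mux network computes.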


Article: 17178
Subject: Re: ALTERA GDF to VHDL QUESTION
From: "Mark Grindell" <petejackson7@hotmail.com>
Date: Wed, 7 Jul 1999 16:44:40 +0100
I wouldn't expect an automatic procedure, and I think from the tone of your
message that you are expecting something like that.

The general procedure with something like this is to write your VHDL in
sections which reflect your GDF in its most general form. You would create
components where you have separate GDFs and basically create a VHDL
hierarchy in parallel to your GDF structure.

This is the easy bit: just fill in the "component is" and "architecture is"
declarations with the signals you find in the GDF.

Then you have to work out how each GDF works, and try to express this in
VHDL using clauses which don't involve a clock at all (combinatorial
blocks), or blocks which execute on a clock edge.

The code I would write would involve conditional statements, such as if or
case, determining the state of each signal in turn. That way
you can separate the whole circuit into signals and their sources.

This is made a lot easier if you really understand the GDFs, maybe having
worked with them for a bit. Of course, in VHDL, you can comment the thing
far more effectively.

If the design has a lot of flip flops with asynchronous inputs used heavily,
you would probably be advised to use one process per flip flop (or set of
flip flops if they all implement a bus), since the sensitivity list should
contain the clocks and asynchronous controls. I think this is certainly the
best way of doing it in Max Plus II.
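[Editor's note: a minimal sketch of the one-process-per-flip-flop style described above; the signal names are hypothetical, not taken from any particular GDF.]

```vhdl
-- One process per flip-flop (or per bus of flip-flops): the sensitivity
-- list carries only the clock and the asynchronous controls, as suggested.
process (clk, areset)
begin
  if areset = '1' then             -- asynchronous clear dominates
    q <= (others => '0');
  elsif rising_edge(clk) then      -- synchronous behaviour on the clock edge
    q <= d;
  end if;
end process;
```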

Good luck

Mark

Asher C. Martin <martin2@acm.uiuc.edu> wrote in message
news:3777D761.4830D62@acm.uiuc.edu...
> Greetings,
>
> I am doing some undergraduate research this summer at Beckman Institute
> and I am working on some VHDL code to control an analog to digital
> converter for various sensors on a robot.
>
> I am fairly new to ALTERA's MAX+PLUS II software and have a question
> regarding how to convert GDF files to straight VHDL.  I would like to
> know if it is possible to tern a GDF file into a VHDL file.
>
> Any suggestions...?
>
> Best regards,
>
> >Asher<
> (Undergraduate students @ UIUC)
>
> <<=>>=<<=>>=<<=>><<=>>=<<=>>=<<=>>
>  Asher C. Martin
>  805 West Oregon Street
>  Urbana, IL 61801-3825
>  (217) 367-3877
>  E-MAIL: martin2@acm.uiuc.edu
>  http://fermi.isdn.uiuc.edu
>  telnet://fermi.isdn.uiuc.edu
>  ftp://feynman.isdn.uiuc.edu
> <<=>>=<<=>>=<<=>><<=>>=<<=>>=<<=>>


Article: 17179
Subject: Re: 100 Billion operations per sec.!
From: "Mark Grindell" <petejackson7@hotmail.com>
Date: Wed, 7 Jul 1999 17:08:48 +0100
Yes.

I thought like the other guys, that this was all marketing hype. This decade
has seen so much of that in so many scenarios, and then there has been the
Asian financial collapse, and with all this together, I am surprised that
there is much venture capital left.

Add this all together and you don't have much prospect of much happening on
a fundamental level. Then I remembered. There was a bunch of guys working in
IBM around the very early eighties, I think they included Goguen, Thatcher,
and Wright, to name three, who were working in Europe in what used to be
called denotational semantics - directed towards compiler generation.

The programs they were engaged in were very ambitious, to say the least, and
the bulk of their work was published either as "private" IBM releases, of
which I think I still might have a couple of copies, or in some Springer
Verlag Lecture Notes in Computer Science series issues.

Among the papers were some very interesting ones on developments
in Category Theory to make isomorphisms between one type of domain (for
instance, predicate calculus) and another (for instance, a real computer
language). They were having difficulties with the expression of temporal
models this way, but as far as I could understand this was being tackled by a
separate group somewhere in England. I can't remember who was doing this,
unless it was Tony Hoare.

All this was quite some time ago, 15 years or so, and I got married just
after that.

But it's interesting, you know. There have been some advances in category
theory anyhow that might have led to some very exciting possibilities with
targeting programming languages via those kinds of transformations to the
FPGA semantics.

Admittedly, at the time they were talking mostly about proof systems for
compilers, but it's been 15 years, and it's been very quiet in this area, and
you never know.

Regards

Mark Grindell


Robert K. Veazey Jr. <rveazey@bellsouth.net> wrote in message
news:3773EC87.C882C349@bellsouth.net...
> I never heard of FPGA's until I recently read an article at CNN's
> website <http://www.cnn.com/TECH/computing/9906/15/supercomp.idg/> that
> said a company called Star Bridge Systems, Inc.
> <http://www.starbridgesystems.com> has developed a revolutionary
> computer using FPGA's that sits on a desktop & plugs into a 120v
> standard outlet but outperforms IBM's fastest supercomputer called
> Pacific Blue (which takes up 8,000sq.ft. floor space and uses something
> like 3 Megawatts of power) by many times. They go on to say that this
> company will be selling $1000 desktop computers in about 18 months that
> are THOUSANDS of times faster than a 350PII Intel processor based
> desktop. Here are my questions:
>
> You guys (and gals) have been using FPGA's and programming for them for
> some time now. What do you all think of these claims?
>
> Do you think it would be beneficial to learn to program in their
> proprietary language (called "Viva")? They will be offering an online
> university for this purpose this fall with initial tutorials being free.
> They claim that their technology is such a breakthrough that it will
> literally replace most known types of ASICs and microprocessors
> quickly.
>
> Let me know what you think.
>
>                             Thanks,
>
>                                     Bob
>


Article: 17180
Subject: Re: 100 Billion operations per sec.!
From: "Mark Grindell" <petejackson7@hotmail.com>
Date: Wed, 7 Jul 1999 17:32:10 +0100
Just another note.

I read the first post again; I didn't notice until a bit later that they
actually have a special language for the code to be run on their target
hardware. That's rather interesting, isn't it? There were concerns for a
long time that conventional languages were too hard to put into denotational
semantics form, and therefore for a long time there were "toy" languages
with specific properties and restrictions, which no-one took seriously at
all. But that may well have changed.

Also, the quest for "languages to suit the hardware" didn't stop in the
seventies or eighties. I heard that some signal processing languages have
been created fairly recently (there is one called "SIGNAL" which has a
French pedigree) in which there are constructs rather close to the hardware
view of things. There was an IEEE proceedings issue on three languages of this
sort about four years ago; it may have been longer.

Damn, all my papers on this are in Australia, and I'm in England!

I still think there's just a small chance something may really have
happened.

Robert K. Veazey Jr. <rveazey@bellsouth.net> wrote in message
news:3773EC87.C882C349@bellsouth.net...
> I never heard of FPGA's until I recently read an article at CNN's
> website <http://www.cnn.com/TECH/computing/9906/15/supercomp.idg/> that
> said a company called Star Bridge Systems, Inc.
> <http://www.starbridgesystems.com> has developed a revolutionary
> computer using FPGA's that sits on a desktop & plugs into a 120v
> standard outlet but outperforms IBM's fastest supercomputer called
> Pacific Blue (which takes up 8,000sq.ft. floor space and uses something
> like 3 Megawatts of power) by many times. They go on to say that this
> company will be selling $1000 desktop computers in about 18 months that
> are THOUSANDS of times faster than a 350PII Intel processor based
> desktop. Here are my questions:
>
> You guys (and gals) have been using FPGA's and programming for them for
> some time now. What do you all think of these claims?
>
> Do you think it would be beneficial to learn to program in their
> proprietary language (called "Viva")? They will be offering an online
> university for this purpose this fall with initial tutorials being free.
> They claim that their technology is such a breakthrough that it will
> literally replace most known types of ASICs and microprocessors
> quickly.
>
> Let me know what you think.
>
>                             Thanks,
>
>                                     Bob
>


Article: 17181
Subject: Re: ENJOY MY AMATEUR WEB SITE.
From: Brian Boorman <XZY.bboorman@harris.com>
Date: Wed, 07 Jul 1999 12:42:08 -0400
Oh, hilariously funny! I didn't understand one word of your non-English
nonsense.

Eneko M.F. wrote:
> 
> --
> VERY FUNNY  WEB SITE.
> ENJOY MY AMATEUR WEB SITE.
> http://members.es.tripod.de/enekom/index.htm

-- 
Brian C. Boorman
Harris RF Communications
Rochester, NY 14610
XYZ.bboorman@harris.com
<Remove the XYZ. for valid address>
Article: 17182
Subject: Re: Tristate Register in Xilinx 4000XLA IO block
From: Brian Philofsky <brianp@xilinx.com>
Date: Wed, 07 Jul 1999 09:52:49 -0700

There is an application note on our web site entitled "Using Three-State
Enable Registers in XLA, XV, and SpartanXL FPGAs."  The link is at
http://www.xilinx.com/xapp/xapp123.pdf .  It should hopefully answer your
question.

By the way, I found this application note using the search engine on the
Xilinx Support site, http://support.xilinx.com .


--  Brian


Hermann Winkler wrote:

> The 4000XLA IOB contains a "tristate register" that can disable the
> tristate output buffer (OBUFT). Opening a 4000XLA design in EPIC
> and looking into any IO block shows this register. In EPIC it is called
> "TRIFF".
> But if I instantiate a primitive element TRIFF in an XNF file, then
> 'ngdbuild' does not recognize it.
>
> Xilinx Support didn't even know anything about this register.
>
> How can I use the new tristate flip-flop in the 4000XLA IOB?

--
-------------------------------------------------------------------
 / 7\'7 Brian Philofsky   (brian.philofsky@xilinx.com)
 \ \ `  Xilinx Applications Engineer             hotline@xilinx.com
 / /    2100 Logic Drive                         1-800-255-7778
 \_\/.\ San Jose, California 95124-3450          1-408-879-5199
-------------------------------------------------------------------




Article: 17183
Subject: Re: Floating point on fpga, Counters?
From: Peter Alfke <peter@xilinx.com>
Date: Wed, 07 Jul 1999 09:57:25 -0700
I agree with Ray that the best solution is a fixed-point counter plus a
converter to floating point. The complexity of that converter depends on the
required speed of conversion. In the simplest case, this might be a slow output
routine, after the counter has finished counting. That could be done by a
sequential normalization at very low cost (essentially just a 5-bit counter).
The most demanding application would normalize on the fly and require a
combinatorial shifter, which is quite expensive.
What is the intent? Just curious.
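[Editor's note: the low-cost sequential scheme can be sketched in a few lines of Python. This is a behavioral model only; the 24-bit count and 10-bit mantissa widths are illustrative assumptions, not figures from the posts.]

```python
def normalize(count, width=24, mant_bits=10):
    """Serially normalize a finished fixed-point count to (mantissa, exponent).

    One left shift per 'clock cycle' until the top bit is set; the
    exponent is just a small counter decremented alongside each shift,
    which is why the hardware cost is essentially one small counter."""
    if count == 0:
        return 0, 0
    x, exp = count, width - mant_bits
    top = 1 << (width - 1)
    while not (x & top):        # one clock per shift: slow but tiny
        x <<= 1
        exp -= 1
    return x >> (width - mant_bits), exp   # truncate to mant_bits bits
```

For any nonzero count of at least 2**(mant_bits-1), the result satisfies mant << exp <= count < (mant + 1) << exp with a normalized (MSB-set) mantissa.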

Peter Alfke
=====================
Trevor Landon wrote:

> This is very similar to the approach that I first thought of.  The main
> problem with it is the need for the barrel shifter to normalize the output.
> (and the associated cost of implimenting it in an FPGA)
>
> I guess heart of my question was whether there was a cleaver algorithm to
> simplify this problem.  (Perhaps keeping a full sized counter in the
> background, and adding/single bit shifting the mantissa as needed)
>
> Thanks,
> Trevor Landon
>
> Ray Andraka <randraka@ids.net> wrote in message
> news:37828FFE.9BCE92F7@ids.net...
> > I think the counter has to be fixed point with as many bits as is required
> to
> > represent the maximum count desired.  If it were floating point, you'd
> need a
> > complementary counter underneath it to resolve increment values below the
> > precision of the mantissa as the count increases.  If you need a floating
> point
> > output from the counter, you would use a fixed point counter of the
> appropriate
> > width followed by a normalizing barrel shifter.
> >
> > Trevor Landon wrote:
> >
> > > While we are on the FP in FPGA discussion...
> > >
> > > What algorithms exist for floating point counters?  I would like to
> > > impliment a 10 bit mantissa, 5 bit exponent FP counter that could
> incriment
> > > in roughly 8 clock cycles.
> > >
> > > I have very little interest in using a full FP adder for the obvious
> > > reasons.
> > >
> > > -Trevor Landon
> > > landont@ttc.com
> > >
> > > Jan Gray <jsgray@acm.org.nospam> wrote in message
> > > news:DZrf3.21$qH2.1013@paloalto-snr1.gtei.net...
> > > > Roland Paterson-Jones wrote in message
> <377DC508.D5F1D048@bigfoot.com>...
> > > > >It has been variously stated that fpga's are no good for floating
> point
> > > > >operations. Why? As I see it, floating point operations are typically
> > > > >just shifted integer operations. Is the bit-width problematic?
> > > >
> > > > For 16-bit floats with (say) 10 bit mantissas, FPGAs should be *great*
> for
> > > > floating point.  Indeed problems start with the wider bit-widths.  The
> > > > area-expensive (and worse than linear scaling) FP components are the
> > > barrel
> > > > shifters needed for pre-add mantissa operand alignment and post-add
> > > > normalization in the FP adder, and of course the FP multiplier array.
> > > >
> > > > The FCCM papers on this subject include:
> > > >
> > > > Ligon et al, A Re-evaluation of the practicality of floating-point
> > > > operations on FPGAs, FCCM 1998
> > > >
> > > > Louca et al, Implementation of IEEE single precision floating point
> > > addition
> > > > and multiplication on FPGAs, FCCM 1996
> > > >
> > > > Shirazi et al, Quantitative analysis of floating point arithmetic on
> FPGA
> > > > based custom computing machines, FCCM 1995
> > > >
> > > > and the neat Leong paper on avoiding the problem entirely,
> > > >
> > > > Leong et al, Automating floating to fixed point translation and its
> > > > application to post-rendering 3D warping, FCCM 1999
> > > >
> > > >
> > > > See the Ligon paper for a nice presentation of speed-area tradeoffs of
> > > > various implementation choices.  Ligon estimates their
> single-precision FP
> > > > adder resource use at between 563 and 629 LUTs -- 36-40% of a XC4020E.
> > > Note
> > > > this group used a synthesis tool; a hand-mapped design could be
> smaller.
> > > >
> > > > Put another way, that single precision FP adder is almost twice the
> area
> > > of
> > > > a pipelined 32-bit RISC datapath.  Ouch.
> > > >
> > > >
> > > > The rest of this article explores ideas for slower-but-smaller FP
> adders.
> > > >
> > > > The two FP add barrel shifters are the problem.  They each need many
> LUTs
> > > > and much interconnect.  For example, a w-bit-wide barrel shifter is
> often
> > > > implemented as lg w stages of w-bit 2-1 muxes, optionally pipelined.
> > > >
> > > > Example 1: single-precision in << s, w=24
> > > >   m0 = s[0] ? in[22:0] << 1 : in;
> > > >   m1 = s[1] ? m0[21:0] << 2: m0;
> > > >   m2 = s[2] ? m1[19:0] << 4 : m1;
> > > >   m3 = s[3] ? m2[15:0] << 8 : m2;  // 16 wires 8 high
> > > >   out = s[4] ? m3[7:0] << 16 : m3; // 8 wires 16 high
> > > > ----
> > > > 5*24 2-1 muxes = 120 LUTs
> > > >
> > > > Example 2: double-precision in << s, w=53
> > > >   m0 = s[0] ? in[51:0] << 1 : in;
> > > >   m1 = s[1] ? m0[50:0] << 2: m0;
> > > >   m2 = s[2] ? m1[48:0] << 4 : m1;
> > > >   m3 = s[3] ? m2[44:0] << 8 : m2; // 45 wires 8 high
> > > >   m4 = s[4] ? m3[36:0] << 16 : m3; // 37 wires 16 high
> > > >   out = s[5] ? m4[20:0] << 32 : m4; // 21 wires 32 high
> > > > ----
> > > > 6*53 2-1 muxes = 318 LUTs
> > > >
> > > > In a horizontally oriented datapath, the last few mux stages have many
> > > > vertical wires, each many LUTs high.  This is more vertical
> interconnect
> > > > than is available in one column of LUTs/CLBs, so the actual area can
> be
> > > much
> > > > worse than the LUT count indicates.
> > > >
> > > >
> > > > BUT we can of course avoid the barrel shifters, and do FP
> > > > denormalization/renormalization iteratively.
> > > >
> > > > Idea #1: Replace the barrel shifters with early-out iterative
> shifters.
> > > For
> > > > example, build a registered 4-1 mux: w = mux(in, w<<1, w<<3, w<<7).
> Then
> > > an
> > > > arbitrary 24-bit shift can be done in 5 cycles or less in ~1/3 of the
> > > area.
> > > > For double precision, make it something like w = mux(in, w<<1, w<<4,
> > > w<<12),
> > > > giving an arbitrary 53-bit shift in 8 cycles.
> > > >
> > > >
> > > > Idea #2: (half baked and sketchy) Do FP addition in a bit- or
> > > nibble-serial
> > > > fashion.
> > > >
> > > > To add A+B, you
> > > >
> > > > 1) compare exponents A.exp and B.exp;
> > > > 2) serialize A.mant and B.mant, LSB first;
> > > > 3) swap (using 2 2-1 muxes) lsb-serial(A.mant) and lsb-serial(B.mant)
> if
> > > > A.exp < B.exp
> > > > 4) delay lsb-serial(A.mant) in a w-bit FIFO for abs(A.exp-B.exp)
> cycles;
> > > > 5) bit-serial-add delay(lsb-serial(A.mant)) + lsb-serial(B.mant) for w
> > > > cycles
> > > > 6) collect in a "sum.mant" shift register
> > > > 7) shift up to w-1 cycles (until result mantissa is normalized).
> > > >
> > > > It may be that steps 4 and 6 are quite cheap, using Virtex 4-LUTs in
> shift
> > > > register mode -- they're variable tap, right?
> > > >
> > > > It is interesting to consider eliminating steps 2, 6, and 7, by
> keeping
> > > your
> > > > FP mantissa values in the serialized representation between
> operations,
> > > > counting clocks since last sum-1-bit seen, and then normalizing
> (exponent
> > > > adjustment only) and aligning *both* operands (via swap/delay) on
> input to
> > > > the next FP operation.  A big chained data computation might exploit
> many
> > > > serially interconnected serial FP adders and serial FP multipliers...
> > > >
> > > > Is this approach better (throughput/area) than a traditional pipelined
> > > > word-oriented FP datapath?  Probably not, I don't know.  But if your
> FP
> > > > needs are modest (Mflops not 100 Mflops) this approach should permit
> quite
> > > > compact FP hardware.
> > > >
> > > > (Philip Freidin and I discussed this at FCCM99.  Thanks Philip.)
> > > >
> > > > Jan Gray
> > > >
> > > >
> > > >
> >
> >
> >
> > --
> > -Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930     Fax 401/884-7950
> > email randraka@ids.net
> > http://users.ids.net/~randraka
> >
> >
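[A quick check of the shift-composition claim quoted above -- an arbitrary 0..52-bit shift in at most 8 cycles built from per-cycle shifts of 1, 4, or 12. The script below is an illustrative model, not anything from the original post.]

```python
# Greedy decomposition of a shift amount into the per-cycle step
# sizes 12, 4 and 1 offered by the mux.  Greedy is optimal here
# because each step size divides the next larger one.
def cycles_needed(shift):
    cycles = 0
    for step in (12, 4, 1):
        cycles += shift // step
        shift %= step
    return cycles

# Worst case over all 53 possible mantissa shift amounts:
worst = max(cycles_needed(s) for s in range(53))
print(worst)  # 8, reached only at shift = 47 (12+12+12+4+4+1+1+1)
```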

Article: 17184
Subject: Re: Programming Xilinx without Foundation
From: Brian Philofsky <brianp@xilinx.com>
Date: Wed, 07 Jul 1999 10:03:05 -0700
Links: << >>  << T >>  << A >>


There is a DOS version of the JTAG Programmer called jtagprog.exe.  You can
run it in batch mode, which should give you a one-command interface to
program, erase, etc., a 9500-series device.  More information on this can be
found in the docs:

http://support.xilinx.com/support/sw_manuals/2_1i/index.htm

or more specifically

http://toolbox.xilinx.com/docsan/2_1i/data/alliance/jtg/app5.htm


Hopefully this will solve your problems.


--  Brian


mel wrote:

> For development I'm quite happy programming the Xilinx 9536 using the
> Foundation software. However, now that I have two finished designs, I'm
> finding that the production department are confused by the Foundation
> software (they're quite happy wielding soldering irons about but they're
> not very PC literate).
>
> So, I'm wondering if there's an easy way to embed the jedec file and
> programming algorithm into a DOS program. This would provide the simplest
> possible way of programming an on-board device via the PC printer port
> programmer.
>
> --
>
>     /Mel/ (at work)

--
-------------------------------------------------------------------
 / 7\'7 Brian Philofsky   (brian.philofsky@xilinx.com)
 \ \ `  Xilinx Applications Engineer             hotline@xilinx.com
 / /    2100 Logic Drive                         1-800-255-7778
 \_\/.\ San Jose, California 95124-3450          1-408-879-5199
-------------------------------------------------------------------




Article: 17185
Subject: Alto in an FPGA (was CPU's directly executing HLL's)
From: "Jan Gray" <jsgray@acm.org.nospam>
Date: Wed, 07 Jul 1999 17:22:50 GMT
Links: << >>  << T >>  << A >>
Paul Wallich wrote in message ...
>It's a little amusing to note that the emulator, the thread executing the
user's
>program, was actually the lowest-priority thread. (Also amusing to
>think that Alto micromachine was something like 1600 gates -- you
>could build dozens of them on a single FPGA).

Perhaps, but if you count the register files and constant and microcode
memory it was much larger than 1600 gates.

A while back (around the Alto's 25th anniversary) I briefly considered
implementing an Alto in a Xilinx XC4000 FPGA.  A 1979-era Alto processor,
*excluding microcode memory*, requires approximately 400 configurable logic
blocks (CLBs):

CLBs  What
----  ----
16    32x16-bit R registers
128   8x32x16-bit S registers  (1979 Alto)
(16    32x16-bit S registers (1974 Alto))
128   256x16-bit constant memory
64?   rest of datapath
64?   control
(4096  4096x32-bit microcode control memory)
----
~400 CLBs + lots of TBUFs (the 16-bit "processor bus" is driven by 9+
sources)

This would probably fill a 24x24 CLB Xilinx XCS30XL.  Perhaps you could
include processor and equivalent I/O controllers in an XCS40XL.

Now Xilinx has introduced their Virtex device family, which features 8+
256x16 dual-port embedded SRAM blocks.  You could implement the S registers
in one block RAM and the constant memory in another.  A 2KW subset of the 4KW
control memory would require 16 more, but would still fit in one of the
larger Virtex devices.
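[The block-RAM arithmetic above checks out when counting raw capacity only -- a quick sketch that ignores aspect-ratio constraints on the 256x16 blocks.]

```python
BLOCK_BITS = 256 * 16                 # one Virtex dual-port block RAM

def blocks_needed(words, width):
    """Block RAMs needed by raw capacity (ceiling division)."""
    return -(-(words * width) // BLOCK_BITS)

# S registers (8 x 32 x 16-bit), constant memory (256 x 16-bit),
# and a 2K-word subset of the 32-bit control store:
print(blocks_needed(8 * 32, 16),      # 1
      blocks_needed(256, 16),         # 1
      blocks_needed(2048, 32))        # 16
```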

ref: Thacker et al, Alto: A Personal Computer, chapter 33 in Siewiorek et
al, Computer Structures: Principles and Examples, McGraw-Hill, 1982

BTW, you can theoretically build dozens of simple CPUs in a single FPGA: see
discussion thread at http://deja.com/getdoc.xp?AN=277216882 (XC4085XL) and
also http://deja.com/getdoc.xp?AN=444640841 (Virtex).

Jan Gray



Article: 17186
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
From: pw@panix.com (Paul Wallich)
Date: Wed, 07 Jul 1999 16:20:48 -0400
Links: << >>  << T >>  << A >>
In article <K1Mg3.12$4H4.1291@paloalto-snr1.gtei.net>, "Jan Gray"
<jsgray@acm.org.nospam> wrote:

>Paul Wallich wrote in message ...
>>It's a little amusing to note that the emulator, the thread executing the
>user's
>>program, was actually the lowest-priority thread. (Also amusing to
>>think that Alto micromachine was something like 1600 gates -- you
>>could build dozens of them on a single FPGA).
>
>Perhaps, but if you count the register files and constant and microcode
>memory it was much larger than 1600 gates.

But should you? One of the things that's pretty clear is that the 
micromachine remained relatively constant while the register
files and control stores got mucked around. Given the bandwidth
requirements (60 MHz by essentially 3 x 32 bits) you could put
everything offchip easily enough. And at the time (MSI) the
partitioning seemed clear...

(I'm only sort of kidding -- the question of what the CPU is goes
right along with the question of what language it "directly
executes".)

>A while back (around Alto's 25th anniversary) I briefly considered
>implementing an Alto in a Xilinx XC4000 FPGA.  A 1979 era Alto processor,
>*excluding microcode memory*, requires approximately 400 configurable logic
>blocks (CLBs):
>
>CLBs  What
>----  ----
>16    32x16-bit R registers
>128   8x32x16-bit S registers  (1979 Alto)
>(16    32x16-bit S registers (1974 Alto))
>128   256x16-bit constant memory
>64?   rest of datapath
>64?   control
>(4096  4096x32-bit microcode control memory)
>----
>~400 CLBs + lots of TBUFs (the 16-bit "processor bus" is driven by 9+
>sources)
>
>This would probably fill a 24x24 CLB Xilinx XCS30XL.  Perhaps you could
>include processor and equivalent I/O controllers in an XCS40XL.
>
>Now Xilinx has introduced their Virtex device family, which features 8+
>256x16 dual port embedded SRAM blocks.  You could implement the S registers
>in one block ram, the constant memory in another.  A 2KW subset of the 4KW
>control memory would require 16 more, but would still fit in one of the
>larger Virtex devices.
>
>ref: Thacker et al, Alto: A Personal Computer, chapter 33 in Siewiorek et
>al, Computer Structures: Principles and Examples, McGraw-Hill, 1982

Or: in Lavendel et al, eds., A Decade of Research, (pp 224-238), R.R.
Bowker, 1980.
Article: 17187
Subject: PCI interface
From: "Jo Van Langendonck" <jvanlang@nospam.dma.be>
Date: Wed, 7 Jul 1999 22:31:12 +0200
Links: << >>  << T >>  << A >>
I am about to develop a PCI board for SUN SPARC stations. Our current
designs use the SBUS, which is becoming obsolete. We will need very high
burst transfer rates.
Currently I'm doing a trade-off between AMCC, PLX, Xilinx PCI core,...
Any technical comments?

Jo Van Langendonck
Alcatel Bell Space




Article: 17188
Subject: Re: Floating point on fpga, Counters?
From: ldoolitt@recycle ()
Date: 7 Jul 1999 22:45:49 GMT
Links: << >>  << T >>  << A >>
Trevor Landon (landont@ttc.com) wrote:

: What algorithms exist for floating point counters?  I would like to
: implement a 10-bit mantissa, 5-bit exponent FP counter that could increment
: in roughly 8 clock cycles.

Peter <peter.alfke@xilinx.com> and I discussed this question in e-mail;
here is my idea for doing it in one or two cycles without using excessive
CLBs.  Peter added to the explanations.

You need a long enough register to hold the full resolution count,
even if that involves guard bits beyond the end of the floating
resolution.  The register needs to be able to shift down (one bit
in a cycle), and carry-in at a programmable bit location.  The output
carry would trigger a down-shift, and increment the exponent, and
the carry-in location is selected by the exponent.  Actually,
the down-shift has to be selected in the same cycle that would
otherwise generate the output carry, so that's an extra level
of combinatorics.  Et voila!  No barrel shifter.

The way I imagine it, it's a single cycle device.
In that cycle, you:

  if (the mantissa is all ones) {
        increment the exponent;
        shift the mantissa down;
  }
  increment the mantissa at the bit selected by the exponent;

Both of those are single cycle events, at least to the extent
the FPGA can deal with adequate bit width for the carry chain
and the all-ones logic.  You could probably also run this nicely
in two cycles, to get rid of the separate all-ones calc.

You can look at this (concisely) as an accumulator (really an
incrementer) with a left-shift capability and a pointer-controlled
insertion point for the 1 that causes the increment.
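[The mechanism can be modeled behaviorally. This is a software sketch under the assumption that the guard bits sit below the mantissa and that the carry-in bit position is GUARD - exponent; the names and the value convention are illustrative, not from the post, and the model only holds while the exponent stays within the guard range.]

```python
MANT, GUARD = 10, 5            # 10-bit mantissa plus guard bits below it
W = MANT + GUARD               # register wide enough for full resolution here

def tick(r, e):
    """One clock of the FP counter; value represented = r * 2**(e - GUARD).

    The +1 enters at the bit position selected by the exponent; a
    register overflow triggers the one-bit down-shift and exponent
    increment described above (done in the same cycle in hardware).
    """
    r += 1 << (GUARD - e)      # carry-in location selected by exponent
    if r >> W:                 # output carry
        r >>= 1                # shift the register down one bit
        e += 1                 # and bump the exponent
    return r, e

r, e = 0, 0
for n in range(1, 2000):
    r, e = tick(r, e)
    assert r << e == n << GUARD   # exact as long as e <= GUARD
```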

       - Larry Doolittle   <LRDoolittle@lbl.gov>
Article: 17189
Subject: Re: Floating point on fpga, Counters?
From: Ray Andraka <randraka@ids.net>
Date: Wed, 07 Jul 1999 19:28:11 -0400
Links: << >>  << T >>  << A >>
The barrel shift for this case is not terribly expensive: it uses 40 CLBs if the
output is a 10-bit mantissa with a 5-bit exponent and the input from the counter
is 25 bits.  It can be further compacted by using the HLUTs.
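[For comparison, the conversion such a barrel shifter performs -- normalizing a binary count into the 10-bit-mantissa, 5-bit-exponent format of the original question -- looks like this behaviorally; names are illustrative only.]

```python
MANT = 10

def to_float(count):
    """Truncating normalization: count ~= mantissa << exponent."""
    exp = max(0, count.bit_length() - MANT)
    return count >> exp, exp

print(to_float(0x1FFFFFF))   # largest 25-bit count -> (1023, 15)
```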

ldoolitt@recycle wrote:

> Trevor Landon (landont@ttc.com) wrote:
>
> : What algorithms exist for floating point counters?  I would like to
> : implement a 10-bit mantissa, 5-bit exponent FP counter that could increment
> : in roughly 8 clock cycles.
>
> Peter <peter.alfke@xilinx.com> and I discussed this question in e-mail;
> here is my idea for doing it in one or two cycles without using excessive
> CLBs.  Peter added to the explanations.
>
> You need a long enough register to hold the full resolution count,
> even if that involves guard bits beyond the end of the floating
> resolution.  The register needs to be able to shift down (one bit
> in a cycle), and carry-in at a programmable bit location.  The output
> carry would trigger a down-shift, and increment the exponent, and
> the carry-in location is selected by the exponent.  Actually,
> the down-shift has to be selected in the same cycle that would
> otherwise generate the output carry, so that's an extra level
> of combinatorics.  Et voila!  No barrel shifter.
>
> The way I imagine it, it's a single cycle device.
> In that cycle, you:
>
>   if (the mantissa is all ones) {
>         increment the exponent;
>         shift the mantissa down;
>   }
>   increment the mantissa at the bit selected by the exponent;
>
> Both of those are single cycle events, at least to the extent
> the FPGA can deal with adequate bit width for the carry chain
> and the all-ones logic.  You could probably also run this nicely
> in two cycles, to get rid of the separate all-ones calc.
>
> You can look at this (concisely) as an accumulator (really an
> incrementer) with a left-shift capability and a pointer-controlled
> insertion point for the 1 that causes the increment.
>
>        - Larry Doolittle   <LRDoolittle@lbl.gov>



--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka


Article: 17190
Subject: Re: 100 Billion operations per sec.!
From: Ray Andraka <randraka@ids.net>
Date: Wed, 07 Jul 1999 19:30:25 -0400
Links: << >>  << T >>  << A >>
You need to consider the people involved.  Some of them are known to the FPGA
community.

Mark Grindell wrote:

> Just another note.
>
> I read the first post again; I didn't notice until a bit later that they
> actually have a special language for the code to be run on their target
> hardware. That's rather interesting, isn't it? There were concerns for a
> long time that conventional languages were too hard to put into denotational
> semantics form, and therefore for a long time, there were "toy" languages
> with specific properties and restrictions, which no-one took seriously at
> all. But that may well have changed.
>
> Also, the quest for "languages to suit the hardware" didn't stop in the
> seventies or eighties. I heard that some signal processing languages have
> been created fairly recently (there is one called "SIGNAL" which has a
> French pedigree) in which there are constructs rather close to the hardware
> view of things. There was an IEEE proceedings on three languages of this
> sort about four years ago. May have been longer.
>
> Damn, all my papers on this are in Australia, and I'm in England!
>
> I still think there's just a small chance something may really have
> happened.
>
> Robert K. Veazey Jr. <rveazey@bellsouth.net> wrote in message
> news:3773EC87.C882C349@bellsouth.net...
> > I never heard of FPGA's until I recently read an article at CNN's
> > website <http://www.cnn.com/TECH/computing/9906/15/supercomp.idg/> that
> > said a company called Star Bridge Systems, Inc.
> > <http://www.starbridgesystems.com> has developed a revolutionary
> > computer using FPGA's that sits on a desktop & plugs into a 120v
> > standard outlet but outperforms IBM's fastest supercomputer called
> > Pacific Blue (which takes up 8,000sq.ft. floor space and uses something
> > like 3 Megawatts of power) by many times. They go on to say that this
> > company will be selling $1000 desktop computers in about 18 months that
> > are THOUSANDS of times faster than a 350PII Intel processor based
> > desktop. Here are my questions:
> >
> > You guys(and gals) have been using FPGA's and programming for them for
> > some time now. What do all think of these claims?
> >
> > Do you think it would be beneficial to learn to program in their
> > proprietary language (called "Viva")? They will be offering an online
> > university for this purpose this fall, with initial tutorials being free.
> > They claim that their technology is such a breakthrough that it will
> > literally replace most known types of ASICs and microprocessors
> > quickly.
> >
> > Let me know what you think.
> >
> >                             Thanks,
> >
> >                                     Bob
> >



--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka


Article: 17191
Subject: Re: PCI interface
From: "Austin Franklin" <austin@dark88room.com>
Date: 7 Jul 1999 23:47:29 GMT
Links: << >>  << T >>  << A >>
The PLX has the best DMA (burst master) system of the three.  If you use
the Xilinx core, you will have to develop your own burst master, which is a
difficult task at best.  The old AMCC chips were dogs, and I haven't tried
any of the new ones.

Be careful with the latest PLX chip, though (the 9054), if you need to use 5V
PCI.  It does not provide any VCCIO pins, and that is in direct violation
of the PCI spec, especially since it is a 3.3V chip used in a 5V PCI
system.  If your PCI voltage is the same as the voltage of the chip, that
is fine.

Technically, the Xilinx part doesn't meet PCI VCCIO spec either....but at
least you can choose a technology part (3.3 or 5V) that meets your
application.

Austin Franklin
austin@darkroom.com


Jo Van Langendonck <jvanlang@nospam.dma.be> wrote in article
<7m0dnf$f04$1@nickel.uunet.be>...
> I am about to develop a PCI board for SUN SPARC stations. Our current
> designs use the SBUS, which is becoming obsolete. We will need very high
> burst transfer rates.
> Currently I'm doing a trade-off between AMCC, PLX, Xilinx PCI core,...
> Any technical comments?
> 
> Jo Van Langendonck
> Alcatel Bell Space
> 
> 
> 
> 
> 
Article: 17192
Subject: Re: 100 Billion operations per sec.!
From: "Austin Franklin" <austin@dark88room.com>
Date: 8 Jul 1999 00:00:42 GMT
Links: << >>  << T >>  << A >>

> "Modern reconfigurable computing could not begin
> until Ross Freeman invented the FPGA." With the
> thought that "Modern" distinguishes the current
> type (FPGA based) of reconfigurable computers from
> projects that came before the FPGA.

I even disagree with that.  We were doing re-configurable computing in the
early and mid 70's with bit-slice processors, that had writeable control
stores.  We even used the term 'reconfigurable compute engine'. 
Specifically, we were developing systems for vision processing.

The IBM 370 had a writeable control store...and, was also a
'reconfigurable' computer.

Using a writeable 'memory' (RAM) as generic loadable/programmable logic has
been a standard hardware technique since I started in Engineering in the
mid 70's.  Using ROMs as PLDs has been too.

Did you know Intel patented 'SNMP' 20 years after it was 'invented', even
though they had nothing to do with 'inventing' it?  Some people believe you
just word something differently, and you have invented something new.

Austin Franklin
austin@darkroom.com

Article: 17193
Subject: Re: Simple PCI card prototyping.
From: "Austin Franklin" <austin@dark88room.com>
Date: 8 Jul 1999 00:11:28 GMT
Links: << >>  << T >>  << A >>
Steve,

Even with assembly code (which should not be an issue, and I don't
understand how assembly code would change what the chip set would do
either...perhaps you could help me understand that) you are still limited
by the architecture of the CPU, not the chip set.  And as such, I have
still not seen near 80M from any Intel CPU (architecture limited) to a PCI
target.  Each Intel CPU is slightly different.

I am still puzzled how you 'saw' what you 'saw'.  Did you look at it on a
PCI analyzer to see the CPU doing more than some 4/16 PCI data cycles?  I
have seen many people claim things (not saying you are one of them), but
when they actually look at it on a logic analyzer, what they have claimed
is not correct.

What I have seen, when the CPU is being the master, is the data is broken
up into multiple PCI transactions, with the number of data cycles per
transfer being the number the CPU multiple-move instruction will do.

Austin
 
Steven Casselman <sc@vcc.com> wrote in article
<37828714.D118454F@vcc.com>...
> >
> >
> > Not quite what I was looking for.  You said you have seen 80M writing
from
> > the CPU to a PCI target.  I would like to know the specifics under
which
> > you saw 80M/sec.  I have never seen anything close to that, except for
a
> > single CPU 'sized' burst transfer.  Your read numbers seem more in
line.
> >
> > Austin
> 
> You have to use the assembly code to get the Intel chip set
> to aggregate the writes; otherwise you will see something
> more in the range of 20-40 Mbytes/sec.
> 
> // word is unsigned int
> 
> void PCICore::write(word addr, word *data, word count)
> {
>    // compute the destination bus address (renamed so it doesn't
>    // shadow -- and illegally redeclare -- the addr parameter)
>    word dest = ((addr<<2) | _offset) + _memBase;
> 
>    word *dptr = data;
> 
>    __asm
>    {
>       push edi
>       push ecx
>       push esi
>       mov esi, dptr
>       mov edi, dest
>       mov ecx, count
>       cld
>       rep movsd        ; copy count dwords to the PCI target
>       pop esi
>       pop ecx
>       pop edi
>    }
> }// end write
> 
> 
> 
> 
> --
> Steve Casselman, President
> Virtual Computer Corporation
> http://www.vcc.com
> 
> 
> 

Article: 17194
Subject: Re: Tristate Register in Xilinx 4000XLA IO block
From: Phil Hays <spampostmaster@sprynet.com>
Date: Wed, 07 Jul 1999 18:49:39 -0700
Links: << >>  << T >>  << A >>
Hermann Winkler wrote:

> How can I use the new tristate flip-flop in the 4000XLA IOB?

If you really need it, look at the following application note:

http://www.xilinx.com/xapp/xapp123.pdf

It's not a very good answer.  The support for this feature is much
better in the Virtex parts.


-- 
Phil Hays
"Irritatingly,  science claims to set limits on what 
we can do,  even in principle."   Carl Sagan
Article: 17195
Subject: Re: Virtex: Excessive PAR run-times without user-feedback?
From: help@for.you
Date: Thu, 08 Jul 1999 04:20:14 GMT
Links: << >>  << T >>  << A >>
Hi,

your problem may come from the following factors:

1) The use of the -c 2 -d 2 switches. Those switches have to do with
routing clean-up and delay improvement. With Virtex, Xilinx doesn't
recommend using them because they take forever to run with minimal delay
improvement. Don't use them; put -c 0 -d 0.

2) Don't overconstrain your design. Your timing constraints MUST be
achievable. If not, placement and routing time increase dramatically;
I've seen more than 10X.

3) You're using multiple placement and route with the -n 5 switch. Do
only multiple placement by adding the -r switch; you'll save the
routing time. You'll still get a timing report in the .par report file.
The timing approximation is within +/- 3% of the actual routed delay.
Select the best placement and then route it using the -k switch
(re-entrant routing).

Hope this helps,

Gerard Auclair
Marconi communications

On 5 Jul 1999 14:19:16 GMT, koch@ultra1.eis.cs.tu-bs.de (Andreas Koch)
wrote:

>As an experiment, I am trying to prototype SUN's picoJava-II processor
>(sans caches and FPU) on a Virtex 1000.  However, PAR has already been
>working on the problem for 73h on a 300Mhz UltraSPARC-II machine and
>appears to be stuck after placement and the detection/disabling of
>circuit loops.
>
>I am very much willing to continue running PAR, but at the moment, it
>is not clear that anything useful is happening at all. The process has
>grown to over 700MB (no problem, this is a 1GB RAM machine) and
>does not perform any system calls (checked with truss).
>
>Should I be more patient, or is the tool just spinning its wheels?
>
>Thanks,
>  Andreas Koch
>
>

Article: 17196
Subject: Re: Altera 10K I/O's
From: Jerry Zdenek <zdenekjs@interaccess.com>
Date: Thu, 08 Jul 1999 00:40:41 -0500
Links: << >>  << T >>  << A >>
> The problem I'm experiencing is when any output of the bus is driving  logic
> high, it is influenced by neighboring data lines allowing 1.5V  switching
> for that entire signal level. When the output drives low, there is no
> switching noise or other influence. It behaves like a weak open collector.
> If a termination resistor (pull-up or pull-down) is applied, (anywhere from
> 200 ohms to 50K) the output is stable. I have checked the PC board for trace
> resistance and shorts (the trace lengths are about 6 inches). Also tried
> several configurations with the output assignments, and the only one that
> didn't give me this problem is when configured as open drain (shouldn't have
> to do this). It's as if the outputs are unable to drive logic high without a
> load.

We had a problem where we were driving 16 bits of data bus low all at
once, and it was causing ground bounce on some control signals that just
happened to be assigned nearby pins.  The data bus was quite slow; it was
the edge speed that was killing us.

Sure sounds like power supply bounce to me.  Have you got all eight
bypass caps installed?  I've also been putting in a bulk Tantalum just
to be sure.

However, the easiest thing to do is to set the data bus to slow slew
rate and see if that clears things up, or as many signals that can
handle the longer delay as you can.

Jerry

Article: 17197
Subject: Re: PCI interface
From: Rickman <spamgoeshere4@yahoo.com>
Date: Thu, 08 Jul 1999 02:01:38 -0400
Links: << >>  << T >>  << A >>
Jo Van Langendonck wrote:
> 
> I am about to develop a PCI board for SUN SPARC stations. Our current
> designs use the SBUS, which is becoming obsolete. We will need very high
> burst transfer rates.
> Currently I'm doing a trade-off between AMCC, PLX, Xilinx PCI core,...
> Any technical comments?
> 
> Jo Van Langendonck
> Alcatel Bell Space

You might consider the OR3+ series parts from Lucent. They have a PCI
interface built in hardware as opposed to CLBs like the Xilinx parts. I
have not used them, so I can't comment on their efficiency or
effectiveness. 


-- 

Rick Collins

rick.collins@XYarius.com

remove the XY to email me.



Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design

Arius
4 King Ave
Frederick, MD 21701-3110
301-682-7772 Voice
301-682-7666 FAX

Internet URL http://www.arius.com
Article: 17198
Subject: Re: Programming Xilinx without Foundation
From: Le mer Michel <michel.lemer@ago.fr>
Date: Thu, 08 Jul 1999 09:13:05 +0200
Links: << >>  << T >>  << A >>
mel wrote:

> For development I'm quite happy programming the Xilinx 9536 using the
> Foundation software. However, now that I have two finished designs, I'm
> finding that the production department are confused by the Foundation
> software (they're quite happy wielding soldering irons about but they're
> not very PC literate).
>
> So, I'm wondering if there's an easy way to embed the jedec file and
> programming algorithm into a DOS program. This would provide the simplest
> possible way of programming an on-board device via the PC printer port
> programmer.
>
> --
>
>     /Mel/ (at work)

Hello

You can just install the software part for downloading. If the parameters
are saved, they will only need to open the icon and click the run button.

Good luck

Michel Le Mer
Gerpi sa (Xilinx Xpert)
3, rue du Bosphore
Alma city
35000 Rennes (France)
(02 99 51 17 18)
http://www.xilinx.com/company/consultants/partdatabase/europedatabase/gerpi.htm

Article: 17199
Subject: Re: Programming Xilinx without Foundation
From: mel@cix...co...uk (mel)
Date: Thu, 8 Jul 1999 08:53 +0100 (BST)
Links: << >>  << T >>  << A >>
brianp@xilinx.com (Brian Philofsky) wrote:

> There is a DOS version of the JTAG Programmer called jtagprog.exe.

<doh!> Is this a recent addition to the Foundation software, or has it
been there all along? I don't recall having seen it before, and I recently
updated to 1.5i.

Thanks for the info, I'll give it a go today.

--

    /Mel/ (at work)

